b. □ have been communicated by the International Bureau.
8. □ An English language translation of the amendments to the claims under PCT Article 19 (35 U.S.C. 371(c)(3)).
36 (35 U.S.C. 371(c)(5)).
□
with PCT Rule 13ter.2 and 35 U.S.C. 1.821 - 1.825.

A second copy of the published international application under 35
U.S.C. 154(d)(4).

A second copy of the English language translation of the
international application under 35 U.S.C. 154(d)(4).
PTO Form 1449, references for IDS, 4 sheets of drawings
INTERNATIONAL APPLICATION NO
PCT/NZ99/00227 



ATTORNEY'S DOCKET NO. 
514274-2001 



21. The following fees are submitted 


CALCULATIONS PTO USE ONLY 


BASIC NATIONAL FEE (37 CFR 1.492(a)(1)-(5)):

Neither international preliminary examination fee (37 CFR 1.482)
nor international search fee (37 CFR 1.445(a)(2)) paid to USPTO
and International Search Report not prepared by the EPO or JPO: $1000.00

International preliminary examination fee (37 CFR 1.482) not paid to
USPTO but International Search Report prepared by the EPO or JPO: $860.00

International preliminary examination fee (37 CFR 1.482) not paid to USPTO but
international search fee (37 CFR 1.445(a)(2)) paid to USPTO: $710.00

International preliminary examination fee paid to USPTO (37 CFR 1.482)
but all claims did not satisfy provisions of PCT Article 33(1)-(4): $690.00

International preliminary examination fee paid to USPTO (37 CFR 1.482)
and all claims satisfied provisions of PCT Article 33(1)-(4): $100.00




ENTER APPROPRIATE BASIC FEE AMOUNT = 


$ 860.00 




Surcharge of $130.00 for furnishing the oath or declaration later than □ 20 □ 30
months from the earliest claimed priority date (37 CFR 1.492(e)).


$ 




CLAIMS               NUMBER FILED   NUMBER EXTRA   RATE         $
Total Claims         24 - 20 =      4              x $18.00     $ 72.00
Independent Claims   4 - 3 =        1              x $80.00     $ 80.00
MULTIPLE DEPENDENT CLAIM(S) (if applicable)        + $270.00    $

TOTAL OF ABOVE CALCULATIONS =                                   $ 1,012.00
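
The arithmetic behind the total can be checked with a short sketch. It assumes only the figures shown on this form (the $860.00 basic fee entered, $18.00 per total claim over 20 and $80.00 per independent claim over 3); the function and constant names are illustrative and not part of any USPTO system.

```python
from decimal import Decimal

# Figures copied from the fee table on this form; names are illustrative only.
BASIC_FEE = Decimal("860.00")         # basic national fee entered on the form
TOTAL_CLAIM_RATE = Decimal("18.00")   # per total claim in excess of 20
INDEP_CLAIM_RATE = Decimal("80.00")   # per independent claim in excess of 3


def national_fee(total_claims: int, independent_claims: int) -> Decimal:
    """Basic fee plus excess-claim fees, before any other surcharges."""
    extra_total = max(total_claims - 20, 0)
    extra_indep = max(independent_claims - 3, 0)
    return (BASIC_FEE
            + extra_total * TOTAL_CLAIM_RATE
            + extra_indep * INDEP_CLAIM_RATE)


print(national_fee(24, 4))  # 1012.00
```

With 24 total claims and 4 independent claims this reproduces the $1,012.00 entered as the total.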




□ Applicant claims small entity status. See 37 C.F.R. 1.27. The fees indicated
above are reduced by 1/2. +


$ 




SUBTOTAL = 


$ 1,012.00 




Processing fee of $130.00 for furnishing the English translation later than □ 20 □ 30 
Months from the earliest claimed priority date (37 CFR 1.492(f)). 


$ 




TOTAL NATIONAL FEE = 


$ 1,012.00 




Fee for recording the enclosed assignments (37 CFR 1.21(h)). The assignment must be
accompanied by an appropriate cover sheet (37 CFR 3.28, 3.31). $40.00 per property +


$ 




TOTAL FEES ENCLOSED = 


$ 1,012.00 












Amount to be 
refunded: 


$ 










Charged: 


$ 



a. ☒ A check in the amount of $ 1,012.00 to cover the above fees is enclosed.

b. □ Please charge my Deposit Account No. ________ in the amount of $________
to cover the above fees. A duplicate copy of this sheet is enclosed.

c. ☒ The Commissioner is hereby authorized to charge any additional fees which may be required, or credit any
overpayment to Deposit Account No. 50-0320. A duplicate copy of this sheet is enclosed.



d. Fees are to be charged to a credit card. WARNING: Information on this form may become public. Credit 
card information should not be included on this form. Provide credit card information and authorization 
on PTO-2038. 

NOTE: Where an appropriate time limit under 37 CFR 1.494 or 1.495 has not been met, a petition to revive (37 CFR
1.137(a) or (b)) must be filed and granted to restore the application to pending status.



SEND ALL CORRESPONDENCE TO: 



William F. Lawrence, Esq. 
FROMMER LAWRENCE & HAUG LLP 
745 Fifth Avenue 
New York, NY 10151 



Date: June 21, 2001



SIGNATURE 
William F. Lawrence, Esq. 
PATENT
514274-2001 
09/868,760 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Applicants      : SCOTTI, et al
U.S. Serial No. : 09/868,760
Filing Date     : June 21, 2001
For             : SERINE PROTEASE INHIBITORS
Examiner        : To be Assigned

745 Fifth Avenue, New York, NY 10151 



EXPRESS MAIL 

Mailing Label Number: EL 819163957 US 



Date of Deposit: July 11, 2001

I hereby certify that this paper or fee is being deposited
with the United States Postal Service "Express Mail Post
Office to Addressee" Service under 37 CFR 1.10 on the
date indicated above and is addressed to: Assistant
Commissioner for Patents, Washington, DC 20231.



(Typed or printed name of person mailing paper or fee) 



(Signature of person mailing paper or fee) 

PRELIMINARY AMENDMENT 

Assistant Commissioner for Patents 
Box Sequence 
Washington, D.C. 20231 

Dear Sir: 

Applicants respectfully request acceptance of the enclosed paper copy and computer 
readable form of the Sequence Listing. It is also respectfully requested that the application be 
amended as follows: 






IN THE SPECIFICATION: 

Please amend the specification as follows: 

Please replace the paragraph beginning at page 17, line 23, with the following rewritten
paragraph:

--The sequence data was then compared with amino acid sequences in searchable computer
databases. Some sequences were found to be of particular interest:

a) a 10 amino acid residue sequence from the N-terminus of pernin (sequence (a) above) showed
only homology with an 8-base anti-thrombin protein sequence (SEQ ID NO: 9) from the
terrestrial leeches (data from US patent 5,455,181 Oct 3, 1995: sequence 10).--

Please replace the paragraph beginning at page 20, line 5, with the following rewritten
paragraph:

--A suite of non-specific primers called pUZ5 was synthesized by Gibco-BRL for the initial 
sequencing based on the N-terminal sequence of pernin. The general formula was 
GAY GGN GAR CAR TGY AAY GAY GGN CAR AA (SEQ ID NO: 10) 
Where Y represents a pyrimidine base, R represents a purine base and N represents any one of 
the four-nucleotide bases. Sequencing was done, initially using pUZ5 and an oligo-dT based 
"bottom strand" primer from PCR amplified cDNA. Sequencing was done by dye-termination 
cycle sequencing using "BigDye" prism technology (Applied Biosystems Incorporated, USA) 
according to their instructions. Products were resolved on an ABI 377 automated sequencer. 
Following the initial sequencing of approximately 500 base pairs pernin-specific primers were 
constructed and used to complete the sequencing of the pernin gene.-- 

Immediately after page 23 and before the first page of claims (page 24), if appropriate,
please insert the enclosed pages identified as --Sequence Listing--. Please renumber the pages
accordingly.



It is respectfully asserted that the sequence disclosure contained in the application now 
fully complies with the requirements set forth in 37 C.F.R. § 1.821 to § 1.825. Attached hereto is 
a marked-up version of the changes made to the specification and claims by the current 
amendment. The attached page is captioned "Version with markings to show changes made." 



REMARKS

These amendments are introduced merely to assign the correct SEQ ID NO: and
to place the nucleotide sequence listing in the application (after the specification and before the
claims). It is respectfully asserted that these amendments do not add any new matter.

In view of the amendments, remarks and enclosures, the application complies with the
requirements for computer readable disclosure of the biological sequences under 37 C.F.R.
§ 1.821-1.825. This response is being submitted without a formal Notice to Comply.

If any additional fees are incurred for entry and consideration of this Amendment, the
Examiner is authorized to charge any fees or credit any overpayment to Deposit Account No.
50-0320.

Respectfully submitted, 

FROMMER LAWRENCE & HAUG LLP 



By: 



Susan K. Lehnhardt 
Reg. No. 33,943 
(212) 588-0800 
VERSION WITH MARKINGS TO SHOW CHANGES MADE
In the specification: 

Paragraph beginning at page 17, line 23 has been amended as follows: 

The sequence data was then compared with amino acid sequences in searchable computer
databases. Some sequences were found to be of particular interest:

a) a 10 amino acid residue sequence from the N-terminus of pernin (sequence (a) above)
showed only homology with an 8-base anti-thrombin protein sequence (SEQ ID NO: 9)
from the terrestrial leeches (data from US patent 5,455,181 Oct 3, 1995: sequence 10).

The paragraph beginning at page 20, line 5 has been amended as follows: 

A suite of non-specific primers called pUZ5 was synthesized by Gibco-BRL for the initial
sequencing based on the N-terminal sequence of pernin. The general formula was
GAY GGN GAR CAR TGY AAY GAY GGN CAR AA (SEQ ID NO: 10)
Where Y represents a pyrimidine base, R represents a purine base and N represents any one of
the four-nucleotide bases. Sequencing was done, initially using pUZ5 and an oligo-dT based
"bottom strand" primer from PCR amplified cDNA. Sequencing was done by dye-termination
cycle sequencing using "BigDye" prism technology (Applied Biosystems Incorporated, USA)
according to their instructions. Products were resolved on an ABI 377 automated sequencer.
Following the initial sequencing of approximately 500 base pairs pernin-specific primers were
constructed and used to complete the sequencing of the pernin gene.
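
The degenerate primer formula above can be unpacked with a small sketch. It assumes the standard IUPAC meanings stated in the paragraph (Y = C or T, R = A or G, N = any base) and the standard genetic code; the function names are illustrative, not part of the applicants' method. It counts how many distinct oligonucleotides the formula covers and back-translates the complete codons.

```python
from itertools import product

# IUPAC codes used in the primer formula (Y = pyrimidine, R = purine, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "CT", "R": "AG", "N": "ACGT"}

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}

PRIMER = "GAYGGNGARCARTGYAAYGAYGGNCARAA"  # SEQ ID NO: 10


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def translate_degenerate(primer: str) -> str:
    """Translate complete codons; 'X' marks a codon that is ambiguous."""
    peptide = []
    for i in range(0, len(primer) - len(primer) % 3, 3):
        options = product(*(IUPAC[b] for b in primer[i:i + 3]))
        residues = {CODON_TABLE["".join(c)] for c in options}
        peptide.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(peptide)


print(degeneracy(PRIMER))            # 2048
print(translate_degenerate(PRIMER))  # DGEQCNDGQ
```

The nine complete codons translate to DGEQCNDGQ, the first nine residues of the N-terminal sequence given as SEQ ID NO: 1; the trailing AA is the start of the tenth codon.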
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
Applicants : SCOTTI, et al 

U.S. Serial No. : 09/868,760 

Filing Date : June 21, 2001 

For : SERINE PROTEASE INHIBITORS 



745 Fifth Avenue, New York, NY 10151 



EXPRESS MAIL 



Mailing Label Number: EL 819163957 US 
Date of Deposit: July 11, 2001 



I hereby certify that this paper or fee is being deposited with the 
United States Postal Service "Express Mail Post Office to 
Addressee" Service under 37 CFR 1.10 on the date indicated 
above and is addressed to: Assistant Commissioner for Patents, 
Washington, DC 20231. 



(Typed or printed name of person mailing paper or fee)

(Signature of person mailing paper or fee)

Statement to Support Filing and Submission
in Accordance With 37 CFR §§ 1.821-1.825

Assistant Commissioner for Patents
Washington, D.C. 20231

Dear Sir:

The undersigned hereby states that the content of the printed Sequence Listing for the
above-referenced application filed, and the computer readable copy, submitted in accordance
with 37 CFR §§ 1.821(c) and (e), are the same and do not add any new matter.

In the unlikely event that the Patent Office determines that an extension and/or other
relief is required as a result of this statement, applicants petition for any required relief including
extensions of time and authorize the Assistant Commissioner to charge the cost of such petitions
and/or other fees due to our Deposit Account No. 50-0320.




Respectfully submitted, 
FROMMER LAWRENCE & HAUG LLP 



Susan K. Lehnhardt 
Reg. No. 33,943 
(212) 588-0800 
SEQUENCE LISTING



<110> The Horticulture and Food Research Institute of New Zealand Limited

<120> Serine Protease Inhibitor 

<130> 514274-2001 

<150> PCT/NZ99/00227 
<151> 1999-12-23 

<150> NZ 336906 
<151> 1999-07-23 

<150> NZ 333568 
<151> 1998-12-23 

<160> 10 

<170> PatentIn version 3.0

<210> 1 
<211> 10 
<212> PRT 

<213> Perna canaliculus 
<400> 1 

Asp Gly Glu Gln Cys Asn Asp Gly Gln Asn
1               5                   10

<210> 2 
<211> 19 
<212> PRT 

<213> Perna canaliculus 
<400> 2 

Gln Gly Gly His Glu Val Glu Ser Glu Arg Val Ala Cys Cys Val Ile
1               5                   10                  15

Gly Arg Ala



<210> 3 

<211> 9 

<212> PRT 

<213> Perna canaliculus 

<400> 3 

Gly Gln Ser His Pro Glu Ile Val His
1               5



<210> 4 
<211> 7 
<212> PRT 



<213> Perna canaliculus 

<400> 4 

Tyr His Gly His Asp Asp Ala 
1 5 

<210> 5 

<211> 7 

<212> PRT 

<213> Perna canaliculus 

<400> 5 

Val Val Asn Glu Val His His 
1 5 

<210> 6 

<211> 1491 

<212> DNA 

<213> Perna canaliculus 

<400> 6 



gayggggagc agtgtaacga tgggcagaac aaagatgacc accatgacga ccaccacgat 60

gatcaccatg acgaccatga tgatgatgat gaaacaatgc actatgccca gtgtgaaatg 120

gaaccaaacc ctcatatggc tagcagcctt caccaccatg tccatggcag catagagttg 180

tcacagaagg gtcatggagc tgtttatcta gaacttcatc ttgtcggatt caacacaagt 240

gaagaccatg acgaccacca tcatggactt catctgcaca tgcttggtga catgtcagca 300

ggttgtgatt ctattggcga actgtacaat gctcacccag aaaaacatgc tgaccctggt 360

gacctcggtg acctggttga cgatgatagg ggcgtggtta atgaagttca tcattatgct 420

tggttggaca ttgatggtac agcaccaaac accgaagctc tcattggaca ctcaatgact 480

attttacaag ggagtcacac cgatgctgat accccagcca gtagaatcgc ctgttgtgtt 540

attggtcatg gaaaagctcg cccagaaaca gcagctgctc tacatcacga gctagaggaa 600

gataaaactg agcattatgc ccattgtgac gtaagatcta atacacacca accaaaggct 660

cttcatcatc atgtccacgg aaccatcgat ttcaaacaag ttggttatgg tgaccttgaa 720

gtgtcctacc atttagaggg atttaatgta agtgatgacc acaaagatca tctccatgac 780

gtacagatct acgccaacgg tgacctgacc agtggatgtg ataacctcgg tgctaaatat 840

gatcctcatg aagattacca cagtgagttg ggtgatctag gagatattca cgatgatgac 900

catggcgttg tcaatgaaag ccacagatat tcctggatca atatcttcgg tgatgacagt 960

gtcctgggac gttctattgc cattcaccaa agagaccatc ttcataaaag tgccaaaatt 1020



gcctgttgtg tcataggacg tggacagagc catccagaaa ttgttcacag agctaaatgt 1080 

gttgtcagac ctaatacaga atctactggt ttacatcacc atgtctctgg ttctataaca 1140 

ttcgaacaga cccctggagg atcaacacat atgacggctg atctcaaagg atttaacgtt 1200 

agtgaggact tgtcacatca tcgtcatggt gtgcagctcc atgaatgggg agatatgtcc 1260 

catggctgtc actccttagg cagaatgtac catggtcatg atgatgctca tgaccccaaa 1320 

agacctggtg accttggtga tgttatagat gattcccatg gcatcgttca ttcaactaga 1380 

acctttgatc atcttaatgt tgaagatctt aacgcacgtt cccttgtgat tatgcagggc 1440 

ggacatgagg tcgagagtga gagggttgct tgctgtgtta taggacgggc a 1491 



<210> 7 
<211> 497 
<212> PRT 

<213> Perna canaliculus 
<400> 7 

Asp Gly Glu Gln Cys Asn Asp Gly Gln Asn Lys Asp Asp His His Asp
1               5                   10                  15

Asp His His Asp Asp His His Asp Asp His Asp Asp Asp Asp Glu Thr
            20                  25                  30

Met His Tyr Ala Gln Cys Glu Met Glu Pro Asn Pro His Met Ala Ser
        35                  40                  45

Ser Leu His His His Val His Gly Ser Ile Glu Leu Ser Gln Lys Gly
    50                  55                  60

His Gly Ala Val Tyr Leu Glu Leu His Leu Val Gly Phe Asn Thr Ser
65                  70                  75                  80

Glu Asp His Asp Asp His His His Gly Leu His Leu His Met Leu Gly
                85                  90                  95

Asp Met Ser Ala Gly Cys Asp Ser Ile Gly Glu Leu Tyr Asn Ala His
            100                 105                 110

Pro Glu Lys His Ala Asp Pro Gly Asp Leu Gly Asp Leu Val Asp Asp
        115                 120                 125

Asp Arg Gly Val Val Asn Glu Val His His Tyr Ala Trp Leu Asp Ile
    130                 135                 140

Asp Gly Thr Ala Pro Asn Thr Glu Ala Leu Ile Gly His Ser Met Thr
145                 150                 155                 160

Ile Leu Gln Gly Ser His Thr Asp Ala Asp Thr Pro Ala Ser Arg Ile
                165                 170                 175

Ala Cys Cys Val Ile Gly His Gly Lys Ala Arg Pro Glu Thr Ala Ala
            180                 185                 190

Ala Leu His His Glu Leu Glu Glu Asp Lys Thr Glu His Tyr Ala His
        195                 200                 205

Cys Asp Val Arg Ser Asn Thr His Gln Pro Lys Ala Leu His His His
    210                 215                 220

Val His Gly Thr Ile Asp Phe Lys Gln Val Gly Tyr Gly Asp Leu Glu
225                 230                 235                 240

Val Ser Tyr His Leu Glu Gly Phe Asn Val Ser Asp Asp His Lys Asp
                245                 250                 255

His Leu His Asp Val Gln Ile Tyr Ala Asn Gly Asp Leu Thr Ser Gly
            260                 265                 270

Cys Asp Asn Leu Gly Ala Lys Tyr Asp Pro His Glu Asp Tyr His Ser
        275                 280                 285

Glu Leu Gly Asp Leu Gly Asp Ile His Asp Asp Asp His Gly Val Val
    290                 295                 300

Asn Glu Ser His Arg Tyr Ser Trp Ile Asn Ile Phe Gly Asp Asp Ser
305                 310                 315                 320

Val Leu Gly Arg Ser Ile Ala Ile His Gln Arg Asp His Leu His Lys
                325                 330                 335

Ser Ala Lys Ile Ala Cys Cys Val Ile Gly Arg Gly Gln Ser His Pro
            340                 345                 350

Glu Ile Val His Arg Ala Lys Cys Val Val Arg Pro Asn Thr Glu Ser
        355                 360                 365

Thr Gly Leu His His His Val Ser Gly Ser Ile Thr Phe Glu Gln Thr
    370                 375                 380

Pro Gly Gly Ser Thr His Met Thr Ala Asp Leu Lys Gly Phe Asn Val
385                 390                 395                 400

Ser Glu Asp Leu Ser His His Arg His Gly Val Gln Leu His Glu Trp
                405                 410                 415

Gly Asp Met Ser His Gly Cys His Ser Leu Gly Arg Met Tyr His Gly
            420                 425                 430

His Asp Asp Ala His Asp Pro Lys Arg Pro Gly Asp Leu Gly Asp Val
        435                 440                 445

Ile Asp Asp Ser His Gly Ile Val His Ser Thr Arg Thr Phe Asp His
    450                 455                 460

Leu Asn Val Glu Asp Leu Asn Ala Arg Ser Leu Val Ile Met Gln Gly
465                 470                 475                 480

Gly His Glu Val Glu Ser Glu Arg Val Ala Cys Cys Val Ile Gly Arg
                485                 490                 495

Ala

<210> 8 

<211> 1611 

<212> DNA 

<213> Perna canaliculus 

<220> 

<221> misc_feature

<222> (1)..(1611)

<223> 'n' can be any one of the nucleotides 'a', 'c', 'g' or 't';




<400> 8 
gayggggagc agtgtaacga tgggcagaac aaagatgacc accatgacga ccaccacgat 60

gatcaccatg acgaccatga tgatgatgat gaaacaatgc actatgccca gtgtgaaatg 120

gaaccaaacc ctcatatggc tagcagcctt caccaccatg tccatggcag catagagttg 180

tcacagaagg gtcatggagc tgtttatcta gaacttcatc ttgtcggatt caacacaagt 240

gaagaccatg acgaccacca tcatggactt catctgcaca tgcttggtga catgtcagca 300

ggttgtgatt ctattggcga actgtacaat gctcacccag aaaaacatgc tgaccctggt 360

gacctcggtg acctggttga cgatgatagg ggcgtggtta atgaagttca tcattatgct 420

tggttggaca ttgatggtac agcaccaaac accgaagctc tcattggaca ctcaatgact 480

attttacaag ggagtcacac cgatgctgat accccagcca gtagaatcgc ctgttgtgtt 540

attggtcatg gaaaagctcg cccagaaaca gcagctgctc tacatcacga gctagaggaa 600

gataaaactg agcattatgc ccattgtgac gtaagatcta atacacacca accaaaggct 660

cttcatcatc atgtccacgg aaccatcgat ttcaaacaag ttggttatgg tgaccttgaa 720

gtgtcctacc atttagaggg atttaatgta agtgatgacc acaaagatca tctccatgac 780

gtacagatct acgccaacgg tgacctgacc agtggatgtg ataacctcgg tgctaaatat 840

gatcctcatg aagattacca cagtgagttg ggtgatctag gagatattca cgatgatgac 900

catggcgttg tcaatgaaag ccacagatat tcctggatca atatcttcgg tgatgacagt 960

gtcctgggac gttctattgc cattcaccaa agagaccatc ttcataaaag tgccaaaatt 1020

gcctgttgtg tcataggacg tggacagagc catccagaaa ttgttcacag agctaaatgt 1080

gttgtcagac ctaatacaga atctactggt ttacatcacc atgtctctgg ttctataaca 1140

ttcgaacaga cccctggagg atcaacacat atgacggctg atctcaaagg atttaacgtt 1200



agtgaggact tgtcacatca tcgtcatggt gtgcagctcc atgaatgggc agatatgtcc 1260



catggctgtc actccttagg cagaatgtac catggtcatg atgatgctca cgaccccaaa 1320 

agacctggtg accttggtga tgttatagat gattcccatg gcatcgttCc ctcaactaga 1380 

acctttgatc atcttaatgt tgaagatctt aacgcacgtt cccttgtgat tatgcagggc 1440 

ggacatgagg tcgagagtga gagggttgct tgctgtgtta taggacgggc atgaataacc 1500

tcactagagt gactttgtct aacatgacaa ttaacaattg tataacttcc :taaaaaata 1560 

aaacaatgac acaatgnaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa a 1611 



<210> 9 

<211> 8 

<212> PRT 

<213> terrestrial leech 

<400> 9 

Gly Gln Ser Cys Asn Asp Gly Gln
1               5

<210> 10 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> primer 
<220> 

<221> misc_feature

<222> (1)..(29)

<223> 'n' can be any one of the nucleotides 'a', 'c', 'g' or 't';
'y' is a pyrimidine nucleotide and 'r' is a purine nucleotide;



<400> 10 

gayggngarc artgyaayga yggncaraa 
SERINE PROTEASE INHIBITOR

This invention relates to a protein and compositions which contain it. More
particularly, it relates to a protein which inter alia exhibits activity as a metal cation
binding agent and as an anti-thrombin agent.

BACKGROUND 

Thrombin is a serine protease involved in blood coagulation. It has specificity for the
cleavage of arginine-lysine bonds as well as cleaving an arginine-threonine bond in
pro-thrombin, releasing pre-thrombin which is subsequently cleaved to produce
active thrombin. This active thrombin can then release more thrombin from
pro-thrombin. In blood clotting and coagulation, thrombin cleaves fibrinopeptide B from
fibrinogen as well as converting blood factors IX to IXa, V to Va, VIII to VIIIa and XIII
to XIIIa.

Inhibitors of thrombin therefore inhibit coagulation and have application in any
procedure where coagulation is undesirable. One such application is in the
collection and storage of blood products. Another is in medicaments for preventing
or reducing coagulation, for example in treating or preventing cardiac malfunctions.

Anti-thrombin agents are known. One example is anti-thrombin III (AT-III).
However, AT-III is capable of effectively inhibiting thrombin only in the presence of
heparin.

The applicants have now identified a novel protein which has a range of activities, 
including anti-thrombin activity, and which when active against thrombin does not 
require heparin as a cofactor. It is towards this protein that the present invention is 
broadly directed. 


SUMMARY OF THE INVENTION 

In a first aspect, the present invention provides an isolated protein which has a 
molecular weight of about 55 kDa and an amino acid sequence which includes one 
or more of the following: 
(a) DGEQCNDGQN (SEQ ID NO. 1)

(b) QGGHEVESERVACCVIGRA (SEQ ID NO. 2) 

(c) GQSHPEIVH (SEQ ID NO. 3) 

(d) YHGHDDA (SEQ ID NO. 4) 

(e) VVNEVHH (SEQ ID NO. 5),



or an active fragment thereof. 
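
As a cross-check between the one-letter sequences listed here and the three-letter entries in the Sequence Listing, the following sketch converts SEQ ID NOs 1 to 5 to one-letter code. The mapping is the standard amino acid code; the dictionary and function names are illustrative only.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# SEQ ID NOs 1 to 5 as given in the Sequence Listing (<400> entries).
LISTING = {
    1: "Asp Gly Glu Gln Cys Asn Asp Gly Gln Asn",
    2: ("Gln Gly Gly His Glu Val Glu Ser Glu Arg "
        "Val Ala Cys Cys Val Ile Gly Arg Ala"),
    3: "Gly Gln Ser His Pro Glu Ile Val His",
    4: "Tyr His Gly His Asp Asp Ala",
    5: "Val Val Asn Glu Val His His",
}


def to_one_letter(three_letter: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


for seq_id, seq in LISTING.items():
    print(seq_id, to_one_letter(seq))
```

The five printed strings are DGEQCNDGQN, QGGHEVESERVACCVIGRA, GQSHPEIVH, YHGHDDA and VVNEVHH, which can be compared directly with items (a) to (e).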

In a further aspect, the invention provides an isolated protein which comprises the
amino acid sequence of
or an active fragment thereof.

In yet a further aspect, the invention provides an isolated protein which is
obtainable from the haemolymph of Perna canaliculus which has an apparent
molecular weight of 75 kDa determined by PAGE, or an active fragment thereof.

Conveniently, said protein or fragment has activity as: 



(i) a serine protease inhibitor; or 
(ii) a divalent cation binding agent.

The invention further provides a protein which is a functionally equivalent variant of 
a protein or fragment as defined above. 

Still further, the invention provides a protein which is obtainable from a shellfish
other than Perna canaliculus and which is a functionally equivalent variant of a
protein or fragment as defined above.

In another aspect, the invention provides a polynucleotide encoding a protein or 
fragment as defined above. 

The polynucleotide may comprise the nucleotide sequence of 

5' GAYGGGGAGCAGTGTAACGATGGGCAGAACAAAGATGACCACCATGACGA
CCACCACGATGATCACCATGACGACCATGATGATGATGATGAAACAATGCACT
ATGCCCAGTGTGAAATGGAACCAAACCCTCATATGGCTAGCAGCCTTCACCA
CCATGTCCATGGCAGCATAGAGTTGTCACAGAAGGGTCATGGAGCTGTTTAT
CTAGAACTTCATCTTGTCGGATTCAACACAAGTGAAGACCATGACGACCACCA
TCATGGACTTCATCTGCACATGCTTGGTGACATGTCAGCAGGTTGTGATTCTA
TTGGCGAACTGTACAATGCTCACCCAGAAAAACATGCTGACCCTGGTGACCT
CGGTGACCTGGTTGACGATGATAGGGGCGTGGTTAATGAAGTTCATCATTATG
CTTGGTTGGACATTGATGGTACAGCACCAAACACCGAAGCTCTCATTGGACA
CTCAATGACTATTTTACAAGGGAGTCACACCGATGCTGATACCCCAGCCAGTA
GAATCGCCTGTTGTGTTATTGGTCATGGAAAAGCTCGCCCAGAAACAGCAGC
TGCTCTACATCACGAGCTAGAGGAAGATAAAACTGAGCATTATGCCCATTGTG
ACGTAAGATCTAATACACACCAACCAAAGGCTCTTCATCATCATGTCCACGGA
ACCATCGATTTCAAACAAGTTGGTTATGGTGACCTTGAAGTGTCCTACCATTTA
GAGGGATTTAATGTAAGTGATGACCACAAAGATCATCTCCATGACGTACAGAT
CTACGCCAACGGTGACCTGACCAGTGGATGTGATAACCTCGGTGCTAAATAT
GATCCTCATGAAGATTACCACAGTGAGTTGGGTGATCTAGGAGATATTCACGA
TGATGACCATGGCGTTGTCAATGAAAGCCACAGATATTCCTGGATCAATATCT
TCGGTGATGACAGTGTCCTGGGACGTTCTATTGCCATTCACCAAAGAGACCAT
CTTCATAAAAGTGCCAAAATTGCCTGTTGTGTCATAGGACGTGGACAGAGCCA
TCCAGAAATTGTTCACAGAGCTAAATGTGTTGTCAGACCTAATACAGAATCTAC
TGGTTTACATCACCATGTCTCTGGTTCTATAACATTCGAACAGACCCCTGGAG
GATCAACACATATGACGGCTGATCTCAAAGGATTTAACGTTAGTGAGGACTTG
TCACATCATCGTCATGGTGTGCAGCTCCATGAATGGGGAGATATGTCCCATG 
GCTGTCACTCCTTAGGCAGAATGTACCATGGTCATGATGATGCTCATGACCCC 
AAAAGACCTGGTGACCTTGGTGATGTTATAGATGATTCCCATGGCATCGTTCA 
TTCAACTAGAACCTTTGATCATCTTAATGTTGAAGATCTTAACGCACGTTCCCT
TGTGATTATGCAGGGCGGACATGAGGTCGAGAGTGAGAGGGTTGCTTGCTGT 
GTTATAGGACGGGCA (SEQ ID NO. 6) 

or a variant thereof. 

Still further, the invention provides a vector or construct which includes a
polynucleotide as defined above.

In another aspect, the invention provides a composition which comprises a protein 
or fragment as defined above. 

The composition may be a medicament, a food, a dietary supplement (optionally
including the protein associated with or bound to at least one divalent cation of
dietary significance) or a bioremediation agent.

In still another aspect, the invention provides a process for obtaining a protein as 
defined above which comprises the step of centrifuging material containing Perna 
canaliculus haemolymph or an extract thereof and recovering the sedimented 
protein. 


DESCRIPTION OF THE DRAWINGS 

While the present invention is broadly as defined above, it also includes
embodiments of which the following description provides examples. In particular, a
better understanding of the present invention will be gained through reference to
the accompanying drawings in which:

Figure 1: Purification of pernin from mussel haemolymph

a) light-scattering band following centrifugation of P. canaliculus haemolymph
in CsCl; haemolymph was first centrifuged at low speed to remove
haemocytes and then at high speed; the re-suspended pellet was then
centrifuged in CsCl.

b) UV absorption profile (254 nm wavelength) from fractionation of the CsCl
gradient; the light-scattering material in Figure 1a appears as a peak.

c) protein composition in 1 ml fractions of a CsCl gradient following
electrophoresis in a 12% polyacrylamide gel; the heavily stained (Coomassie)
bands coincide with the position of the light-scattering and UV-absorbing
regions of the gradient; the molecular weight was approximately 75 kDa as
compared with polypeptide molecular weight standards (lane 6) (refer Figure
4a for standards). Lanes 1-5 and 7-9 contained samples from the CsCl
gradient.

Figure 2: Virus-like particles observed by transmission electron microscopy of
material in light scattering band in a CsCl gradient. Bar in micrograph represents
100 nm.

Figure 3: HPLC elution profile of pernin at 280 nm wavelength purified by CsCl
gradient centrifugation.

Figure 4: SDS-PAGE profiles (12% gels) of aggregating protein species from P.
canaliculus and other shellfish species

a) proteins extracted from whole shellfish and purified as described in
Materials and Methods: lane 1: molecular weight standards (Bio-Rad, USA):
pb phosphorylase B, 97.4 kDa; bsa bovine serum albumin, 66 kDa; ova
ovalbumin, 45 kDa; ca carbonic anhydrase, 31 kDa; lane 2: Greenshell™
mussel P. canaliculus; lane 3: blue mussel Mytilus edulis; lane 4: oyster
Crassostrea gigas; lane 5: pipis Paphies australis.

b) PAGE analysis of human transferrin (Sigma, USA, MW ca. 80 kDa), a
glycosylated protein, and pernin from P. canaliculus following treatment with
endoglycosidase-F: lane 1: untreated transferrin; lane 2: transferrin treated
with glycosidase-F; lane 3: untreated pernin; lane 4: pernin treated with
glycosidase-F.

Figure 5: Activity of P. canaliculus haemolymph protein following centrifugation in a
30 kDa molecular weight exclusion filter for 10 min at 1000 g (Ultrafree-MC filter,
30,000 MW exclusion, Millipore, USA)

a) SDS-PAGE profile of haemolymph protein at various stages of purification.
Lane 1: "crude" haemolymph (haemocytes removed); lane 2: resuspended
pellet after ultracentrifugation of "crude" haemolymph for 80 min at 250,000
g; lane 3: pernin retentate; lane 4: filtrate (no proteins evident); lane 5:
molecular weight markers (refer Figure 4a); lanes 6, 7: 10-fold dilutions of
samples from lanes 2 and 3.

b) Anti-thrombin activity of 30,000 MW exclusion filter retentate and filtrate.
con+ = the standard 1/41 dilution of human plasma (i.e. standard
anti-thrombin III activity);
con- = thrombin with no added plasma (buffer control);
filtrate: material passed through a 30,000 MW exclusion filter;
retentate: pernin protein retained by exclusion filter.

DESCRIPTION OF THE INVENTION 

As broadly outlined above, in one aspect the present invention provides a novel
protein. The protein of the invention has an apparent molecular weight of 75 kDa,
calculated by polyacrylamide gel electrophoresis (PAGE). The molecular weight
inferred from the gene sequence is approximately 55 kDa.
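
The 55 kDa figure is consistent with a rough back-of-envelope estimate from the lengths given in the Sequence Listing. The sketch below assumes an average residue mass of about 110 Da, which is a common rule of thumb and not a value taken from this specification; an exact mass would require summing the individual residue masses.

```python
# Lengths taken from the Sequence Listing (<211> of SEQ ID NO: 6 and 7).
CDS_LENGTH_NT = 1491
PROTEIN_LENGTH_AA = 497

# Assumption: rule-of-thumb average mass per amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0

residues = CDS_LENGTH_NT // 3                   # three nucleotides per codon
mass_kda = residues * AVG_RESIDUE_MASS_DA / 1000

print(residues == PROTEIN_LENGTH_AA)  # True
print(round(mass_kda, 1))             # 54.7
```

The estimate of roughly 54.7 kDa agrees with the stated 55 kDa, while the apparent 75 kDa by PAGE is a separate, experimentally observed value.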

One specific protein of the invention was initially identified as an extract from the
New Zealand green lipped mussel P. canaliculus. It is therefore obtainable by
extraction directly from P. canaliculus.

This protein has the amino acid sequence of SEQ ID NO. 7.
The protein of the invention can include its entire native amino acid sequence or
can include only parts of that sequence where such parts constitute fragments 
which remain biologically active (active fragments). Such activity will normally be 
as a serine protease inhibitor or a divalent cation binding agent but is not restricted
to these activities.



The invention also includes within its scope functionally equivalent variants of the 
protein of SEQ ID NO. 7. 

The phrase "functionally equivalent variants" recognises that it is possible to vary
the amino acid sequence of a protein while retaining substantially equivalent functionality.
For example, a protein can be considered a functional equivalent of another protein
for a specific function if the equivalent peptide is immunologically cross-reactive
with and has at least substantially the same function as the original protein.

The functionally equivalent protein need not be the same size as the original. The
equivalent can be, for example, a fragment of the protein, a fusion of the protein
with another protein or carrier, or a fusion of a fragment with additional amino
acids. It is also possible to substitute amino acids in a sequence with equivalent
amino acids using conventional techniques. Groups of amino acids normally held to
be equivalent are:

(a) Ala, Ser, Thr, Pro, Gly;

(b) Asn, Asp, Glu, Gln;

(c) His, Arg, Lys;

(d) Met, Leu, Ile, Val; and

(e) Phe, Tyr, Trp.
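
The five equivalence groups above can be expressed as a simple lookup. The following is a minimal sketch, assuming one-letter codes and treating an unchanged residue as trivially equivalent; the function name is illustrative and the groups are exactly those listed, so residues outside every group (such as Cys) are equivalent only to themselves.

```python
# Equivalence groups (a) to (e) from the list above, in one-letter code.
EQUIVALENCE_GROUPS = [
    set("ASTPG"),  # (a) Ala, Ser, Thr, Pro, Gly
    set("NDEQ"),   # (b) Asn, Asp, Glu, Gln
    set("HRK"),    # (c) His, Arg, Lys
    set("MLIV"),   # (d) Met, Leu, Ile, Val
    set("FYW"),    # (e) Phe, Tyr, Trp
]


def is_equivalent_substitution(original: str, replacement: str) -> bool:
    """True if both residues are identical or fall in the same group."""
    if original == replacement:
        return True
    return any(original in group and replacement in group
               for group in EQUIVALENCE_GROUPS)


print(is_equivalent_substitution("L", "I"))  # True
print(is_equivalent_substitution("D", "K"))  # False
```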



Polypeptide sequences may be aligned, and percentage of identical amino acids in a
specified region may be determined against another sequence, using computer
algorithms that are publicly available. The similarity of polypeptide sequences may
be examined using the BLASTP algorithm. BLASTP software is available on the NCBI
anonymous FTP server (ftp://ncbi.nlm.nih.gov) under /blast/executables/. The use
of the BLAST family of algorithms, including BLASTP, is described at NCBI's website
at URL http://www.ncbi.nlm.nih.gov/BLAST/newblast.html and in the publication
of Altschul, Stephen F., et al. (1997), "Gapped BLAST and PSI-BLAST: a new
generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

The protein of the invention together with its active fragments and other variants
may be generated by synthetic or recombinant means. Synthetic polypeptides
having fewer than about 100 amino acids, and generally fewer than about 50 amino
acids, may be generated by techniques well known to those of ordinary skill in the
art. For example, such peptides may be synthesised using any of the commercially
available solid-phase techniques such as the Merrifield solid phase synthesis
method, where amino acids are sequentially added to a growing amino acid chain
(see Merrifield, J. Am. Chem. Soc. 85: 2146-2149 (1963)). Equipment for automated
synthesis of peptides is commercially available from suppliers such as Perkin
Elmer/Applied Biosystems, Inc. and may be operated according to the
manufacturer's instructions.

The protein, or a fragment or variant thereof, may also be produced recombinantly
by inserting a polynucleotide (usually DNA) sequence that encodes the protein into
an expression vector and expressing the protein in an appropriate host. Any of a
variety of expression vectors known to those of ordinary skill in the art may be
employed. Expression may be achieved in any appropriate host cell that has been
transformed or transfected with an expression vector containing a DNA molecule
which encodes the recombinant protein. Suitable host cells include prokaryotes,
yeasts and higher eukaryotic cells. Preferably, the host cells employed are E. coli,
yeasts or a mammalian cell line such as COS or CHO, or an insect cell line, such as
SF9, using a baculovirus expression vector. The DNA sequence expressed in this
manner may encode the naturally occurring protein, fragments of the naturally
occurring protein or variants thereof.

DNA sequences encoding the protein or fragments may be obtained by screening an
appropriate P. canaliculus cDNA or genomic DNA library for DNA sequences that
hybridise to degenerate oligonucleotides derived from partial amino acid sequences
of the protein. Suitable degenerate oligonucleotides may be designed and
synthesised by standard techniques and the screen may be performed as described,
for example, in Maniatis et al., Molecular Cloning - A Laboratory Manual, Cold
Spring Harbour Laboratories, Cold Spring Harbour, NY (1989). The polymerase
chain reaction (PCR) may be employed to isolate a nucleic acid probe from genomic
DNA, a cDNA or genomic DNA library. The library screen may then be performed
using the isolated probe.

Variants of the protein may be prepared using standard mutagenesis techniques 
such as oligonucleotide-directed site specific mutagenesis. 

A specific polynucleotide of the invention has the nucleotide sequence of SEQ ID 
NO. 6 as follows: 

5'GAYGGGGAGCAGTGTAACGATGGGCAGAACAAAGATGACCACCATGACGA

CCACCACGATGATCACCATGACGACCATGATGATGATGATGAAACAATGCACT 

ATGCCCAGTGTGAAATGGAACCAAACCCTCATATGGCTAGCAGCCTTCACCA 

CCATGTCCATGGCAGCATAGAGTTGTCACAGAAGGGTCATGGAGCTGTTTAT 

CTAGAACTTCATCTTGTCGGATTCAACACAAGTGAAGACCATGACGACCACCA 

TCATGGACTTCATCTGCACATGCTTGGTGACATGTCAGCAGGTTGTGATTCTA 

TTGGCGAACTGTACAATGCTCACCCAGAAAAACATGCTGACCCTGGTGACCT 

CGGTGACCTGGTTGACGATGATAGGGGCGTGGTTAATGAAGTTCATCATTATG 

CTTGGTTGGACATTGATGGTACAGCACCAAACACCGAAGCTCTCATTGGACA 

CTCAATGACTATTTTACAAGGGAGTCACACCGATGCTGATACCCCAGCCAGTA 

GAATCGCCTGTTGTGTTATTGGTCATGGAAAAGCTCGCCCAGAAACAGCAGC 

TGCTCTACATCACGAGCTAGAGGAAGATAAAACTGAGCATTATGCCCATTGTG 

ACGTAAGATCTAATACACACCAACCAAAGGCTCTTCATCATCATGTCCACGGA 

ACCATCGATTTCAAACAAGTTGGTTATGGTGACCTTGAAGTGTCCTACCATTTA 

GAGGGATTTAATGTAAGTGATGACCACAAAGATCATCTCCATGACGTACAGAT 

CTACGCCAACGGTGACCTGACCAGTGGATGTGATAACCTCGGTGCTAAATAT 

GATCCTCATGAAGATTACCACAGTGAGTTGGGTGATCTAGGAGATATTCACGA 

TGATGACCATGGCGTTGTCAATGAAAGCCACAGATATTCCTGGATCAATATCT 

TCGGTGATGACAGTGTCCTGGGACGTTCTATTGCCATTCACCAAAGAGACCAT 

CTTCATAAAAGTGCCAAAATTGCCTGTTGTGTCATAGGACGTGGACAGAGCCA 

TCCAGAAATTGTTCACAGAGCTAAATGTGTTGTCAGACCTAATACAGAATCTAC 

TGGTTTACATCACCATGTCTCTGGTTCTATAACATTCGAACAGACCCCTGGAG 

GATCAACACATATGACGGCTGATCTCAAAGGATTTAACGTTAGTGAGGACTTG 

TCACATCATCGTCATGGTGTGCAGCTCCATGAATGGGGAGATATGTCCCATG 

GCTGTCACTCCTTAGGCAGAATGTACCATGGTCATGATGATGCTCATGACCCC 

AAAAGACCTGGTGACCTTGGTGATGTTATAGATGATTCCCATGGCATCGTTCA 
TTCAACTAGAACCTTTGATCATCTTAATGTTGAAGATCTTAACGCACGTTCCCT
TGTGATTATGCAGGGCGGACATGAGGTCGAGAGTGAGAGGGTTGCTTGCTGT 
GTTATAGGACGGGCA. 

A further polynucleotide has the sequence of SEQ ID NO. 8 as follows: 

5'GAYGGGGAGCAGTGTAACGATGGGCAGAACAAAGATGACCACCATGACGA 

CCACCACGATGATCACCATGACGACCATGATGATGATGATGAAACAATGCACT 

ATGCCCAGTGTGAAATGGAACCAAACCCTCATATGGCTAGCAGCCTTCACCA 

CCATGTCCATGGCAGCATAGAGTTGTCACAGAAGGGTCATGGAGCTGTTTAT 

CTAGAACTTCATCTTGTCGGATTCAACACAAGTGAAGACCATGACGACCACCA 

TCATGGACTTCATCTGCACATGCTTGGTGACATGTCAGCAGGTTGTGATTCTA 

TTGGCGAACTGTACAATGCTCACCCAGAAAAACATGCTGACCCTGGTGACCT 

CGGTGACCTGGTTGACGATGATAGGGGCGTGGTTAATGAAGTTCATCATTATG 

CTTGGTTGGACATTGATGGTACAGCACCAAACACCGAAGCTCTCATTGGACA 

CTCAATGACTATTTTACAAGGGAGTCACACCGATGCTGATACCCCAGCCAGTA 

GAATCGCCTGTTGTGTTATTGGTCATGGAAAAGCTCGCCCAGAAACAGCAGC 

TGCTCTACATCACGAGCTAGAGGAAGATAAAACTGAGCATTATGCCCATTGTG 

ACGTAAGATCTAATACACACCAACCAAAGGCTCTTCATCATCATGTCCACGGA 

ACCATCGATTTCAAACAAGTTGGTTATGGTGACCTTGAAGTGTCCTACCATTTA 

GAGGGATTTAATGTAAGTGATGACCACAAAGATCATCTCCATGACGTACAGAT 

CTACGCCAACGGTGACCTGACCAGTGGATGTGATAACCTCGGTGCTAAATAT 

GATCCTCATGAAGATTACCACAGTGAGTTGGGTGATCTAGGAGATATTCACGA 

TGATGACCATGGCGTTGTCAATGAAAGCCACAGATATTCCTGGATCAATATCT 

TCGGTGATGACAGTGTCCTGGGACGTTCTATTGCCATTCACCAAAGAGACCAT 

CTTCATAAAAGTGCCAAAATTGCCTGTTGTGTCATAGGACGTGGACAGAGCCA 

TCCAGAAATTGTTCACAGAGCTAAATGTGTTGTCAGACCTAATACAGAATCTAC 

TGGTTTACATCACCATGTCTCTGGTTCTATAACATTCGAACAGACCCCTGGAG 

GATCAACACATATGACGGCTGATCTCAAAGGATTTAACGTTAGTGAGGACTTG 

TCACATCATCGTCATGGTGTGCAGCTCCATGAATGGGGAGATATGTCCCATG 

GCTGTCACTCCTTAGGCAGAATGTACCATGGTCATGATGATGCTCATGACCCC 

AAAAGACCTGGTGACCTTGGTGATGTTATAGATGATTCCCATGGCATCGTTCA 

TTCAACTAGAACCTTTGATCATCTTAATGTTGAAGATCTTAACGCACGTTCCCT 

TGTGATTATGCAGGGCGGACATGAGGTCGAGAGTGAGAGGGTTGCTTGCTGT 

GTTATAGGACGGGCATGAATAACCTCACTAGAGTGACTTTGTCTAACATGACA 
ATTAACAATTGTATAACTTCG
AAAAAAAAAAAAAAAAAAAAAAAAAAAA3 ' 



with TGA being the opal stop codon and AATAAA the polyadenylation signal. 
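As an illustration of how the coding sequence relates to the protein, the sketch below translates nucleotide triplets with the standard genetic code. It is a minimal example under stated assumptions: the function names are hypothetical, the degenerate base Y (C or T) is resolved by translating every possibility, and a codon is reported as "X" if its possibilities disagree. The two test strings are the first thirty bases of the recited sequence and the last three codons before the TGA stop.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC codes needed here: R = purine, Y = pyrimidine, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}


def translate_codon(codon):
    """Translate one codon; return 'X' if degenerate bases give different residues."""
    residues = {
        CODON_TABLE["".join(bases)]
        for bases in product(*(IUPAC[b] for b in codon))
    }
    return residues.pop() if len(residues) == 1 else "X"


def translate(dna):
    """Translate a nucleotide string in frame 1; '*' marks a stop codon."""
    dna = "".join(dna.split()).upper()
    return "".join(
        translate_codon(dna[i:i + 3]) for i in range(0, len(dna) - 2, 3)
    )


if __name__ == "__main__":
    print(translate("GAYGGGGAGCAGTGTAACGATGGGCAGAAC"))  # start of the sequence
    print(translate("GGACGGGCATGA"))                    # last codons and stop
```

Because GAC and GAT both encode aspartic acid, the leading GAY codon translates unambiguously despite the degenerate base.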

Variants or homologues of the above polynucleotide sequences also form part of the
present invention. Polynucleotide sequences may be aligned, and the percentage of
identical nucleotides in a specified region may be determined against another
sequence, using computer algorithms that are publicly available. Two exemplary
algorithms for aligning and identifying the similarity of polynucleotide sequences are
the BLASTN and FASTA algorithms. The BLASTN software is available on the NCBI
anonymous FTP server (ftp://ncbi.nlm.nih.gov) under /blast/executables/. The
BLASTN algorithm version 2.0.4 [Feb-24-1998], set to the default parameters
described in the documentation and distributed with the algorithm, is preferred for
use in the determination of variants according to the present invention. The use of
the BLAST family of algorithms, including BLASTN, is described at NCBI's website at
URL http://www.ncbi.nlm.nih.gov/BLAST/newblast.html and in the publication of
Altschul, Stephen F., et al. (1997), "Gapped BLAST and PSI-BLAST: a new generation
of protein database search programs", Nucleic Acids Res. 25:3389-3402. The
computer algorithm FASTA is available on the Internet at the ftp site
ftp://ftp.virginia.edu/pub/fasta/. Version 2.0u4, February 1996, set to the default
parameters described in the documentation and distributed with the algorithm, is
preferred for use in the determination of variants according to the present invention.
The use of the FASTA algorithm is described in W.R. Pearson and D.J. Lipman,
"Improved Tools for Biological Sequence Analysis," Proc. Natl. Acad. Sci. USA
85:2444-2448 (1988) and W.R. Pearson, "Rapid and Sensitive Sequence Comparison
with FASTP and FASTA," Methods in Enzymology 183:63-98 (1990).

All sequences identified as above qualify as "variants" as that term is used herein. 

Variant polynucleotide sequences will generally hybridize to the recited
polynucleotide sequence under stringent conditions. As used herein, "stringent
conditions" refers to prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at
65°C, 6X SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in
1X SSC, 0.1% SDS at 65°C and two washes of 30 minutes each in 0.2X SSC, 0.1%
SDS at 65°C. Such hybridizable sequences include those which code for the
equivalent protein from sources (such as shellfish) other than P. canaliculus.

While the above synthetic or recombinant approaches can be taken to produce the
protein of the invention, it is however practicable (and indeed presently preferred) to
obtain the protein by isolation from P. canaliculus. This reflects the applicants'
finding that the protein is the dominant protein of the haemolymph of P. canaliculus
and also that the protein is self-aggregating. It can therefore be isolated in
commercially significant quantities direct from the mussel itself. For example,
approximately 2 mg of the protein can be obtained per ml of haemolymph.

Once obtained, the protein is readily purified if desired. This will generally involve
centrifugation in which the self-aggregating nature of the protein is important.
Other approaches to purification (e.g. chromatography) can however also be followed.

Furthermore, if viewed as desirable, additional purification steps can be employed
using approaches which are standard in this art. These approaches are fully able to
deliver a highly pure preparation of the protein.

Once obtained, the protein and/or its active fragments can be formulated into a
composition. The composition can be, for example, a therapeutic composition for
application as a pharmaceutical, or can be a health or dietary supplement. Again,
standard approaches can be taken in formulating such compositions.

Still further, the composition can be a food in which the protein and/or its active
fragments are included. This can occur by adding the protein to a pre-prepared
foodstuff, or incorporating the protein into a step of the manufacturing process for
the food.

The invention will now be described more fully in the following experimental section
which is provided for illustrative purposes only.
EXPERIMENTAL

Section 1

A. Materials and Methods

A.1 Shellfish: Perna canaliculus (the New Zealand green-lipped mussel; the
Greenshell™ mussel) were obtained at retail supermarket outlets or from
mussel farmers directly; other shellfish species were obtained from retail
outlets except for the blue mussel Mytilus edulis which was supplied by
Sanford's Fisheries (Havelock, New Zealand).

A.2 Extracts: Mussel extracts were prepared by homogenising whole, shucked
mussels (up to 120 mm length) in a commercial food processor with the
addition of 0.02 M sodium phosphate buffer, pH 7.2. Dichloromethane (1/2
volume) was mixed with the aqueous extract and the mixture centrifuged at
low speed (6000 rpm, GSA rotor, Sorvall RC-5B centrifuge at 4 °C).
Polyethylene glycol (PEG) (MW 6000) was added to the aqueous phase to a
final concentration of 10% (w/v) and NaCl to 0.5 M and stirred at 4-6 °C
overnight. Following low speed centrifugation the PEG-precipitate was
resuspended in approximately 1/10 volume of sodium phosphate buffer. After
another cycle of low-speed centrifugation the supernatant was centrifuged
at high speed (50,000 rpm in a Beckman 60Ti rotor at 4 °C for 60-80
minutes). The resultant pellet was resuspended in a small volume of
phosphate buffer and clarified by low speed centrifugation.

A.3 Polyacrylamide gel electrophoresis: 12% polyacrylamide gels (8 x 10 cm; 1
mm thick) were cast using a prepared stock solution according to the
manufacturer's instructions (40% acrylamide/bis solution 37.5:1, Bio-Rad,
USA); commercially available 12% gels (Bio-Rad, USA) were also used.
Samples (10 µl) were applied to lanes and the gels run at 160 V using a
standard Tris/Glycine/SDS buffer (Bio-Rad, catalogue 161-0732) until the
bromophenol blue marker reached the bottom of the gel. Gels were stained
with BM Fast Stain Coomassie® (Boehringer Mannheim, Germany) and
destained as per the manufacturer's instructions.

A.4 Glycosylation test: Samples were treated with N-glycosidase F (PNGase F
from Flavobacterium meningosepticum; Boehringer Mannheim Biochemica,
Germany) according to the manufacturer's directions. Treated and
untreated samples were run in a standard 12% polyacrylamide gel.

A.5 Isopycnic gradients: CsCl (Boehringer Mannheim, Germany) solutions
were prepared in 0.1 M sodium phosphate buffer, pH 7.2 and filtered
through a 0.22 µm membrane (Acrodisc, Gelman Sciences, USA) to clarify.
Two step gradients (1.25 g/cc top layer containing the sample and 1.45 g/cc
bottom layer) were prepared as described by Scotti (1985) and centrifuged
for approximately 17 hours at 20 °C in a Beckman 70Ti rotor at 30,000 rpm.
The resultant gradient was fractionated by inserting a 100 µl glass capillary
tube into the gradient and slowly pumping out the contents. UV absorbance
was monitored by passing through a Uvicord spectrophotometer (LKB
Produkter, Sweden). Fractions were collected and the refractive indices
measured using an Abbe refractometer (Bellingham and Stanley, UK) and
the density estimated using regression equations according to the method
of Scotti (1985).

A.6 Porous glass chromatography: Controlled pore glass (CPG 240-80, Sigma
Chemical Co., USA) was treated according to the supplier's directions. A 1
cm x 100 cm column (Bio-Rad, USA) was prepared. Samples (1-2 ml) were
loaded onto the column and eluted with 0.1 M sodium phosphate buffer, pH
7.2, through a Uvicord spectrophotometer, fractions being collected at
regular intervals.

A.7 Estimation of protein concentration: Concentrations were estimated
using a bovine serum albumin standard (Blot Qualified BSA, Promega, USA)
by UV absorption according to the method of Layne (1957) using the
equation: mg/ml protein = 1.55 x A280 - 0.76 x A260. Alternatively,
concentration was estimated by the Bradford reaction using reagent
supplied by Bio-Rad (USA) at a wavelength of 620 nm.
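The UV absorption estimate in A.7 is a one-line calculation. The sketch below simply restates the quoted equation as a function; the function name and the absorbance readings in the example are hypothetical and chosen only to show the arithmetic.

```python
def layne_protein_mg_per_ml(a280, a260):
    """Protein concentration (mg/ml) from the equation quoted in A.7:
    1.55 x A280 - 0.76 x A260."""
    return 1.55 * a280 - 0.76 * a260


if __name__ == "__main__":
    # Hypothetical readings, not measurements from the specification.
    a280, a260 = 1.00, 0.60
    print(round(layne_protein_mg_per_ml(a280, a260), 3), "mg/ml")
```

The A260 term corrects for nucleic acid absorbance, so a sample with a higher A260 at the same A280 gives a lower protein estimate.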
A.8 High performance liquid chromatography: Reversed-phase HPLC was
performed on an HP 1050 Ti-series HPLC (Hewlett Packard, USA) fitted with
an analytical 300 Å Vydac C-18 column, 25 cm x 4.6 mm i.d. The 10 µl
sample in water was eluted with a 0-100% acetonitrile in water (v/v)
gradient over 60 min and the absorption at 218 and 280 nm was recorded.

B. Results 

A light-scattering band was seen after centrifugation of extracts of whole
Greenshell™ mussels in CsCl gradients (Figures 1a and 1b). The density of this
band was estimated at 1.368 g/cc. A minor band was sometimes observed at
approximately 1.390 g/cc. If rebanded in CsCl the 1.390 band yielded two bands -
one at 1.390 g/cc and a second at 1.368 g/cc. SDS-PAGE analysis of fractions of
either density gave similar polypeptide profiles with a single major band. The
molecular weight of the protein by PAGE was estimated as 75,000 (75 kDa) (Figure
1c). Several minor bands of higher molecular weight and an additional minor band
of 45 kDa were also seen. The main band (called pernin) at 75 kDa was always in
great excess compared to the minor bands. When material from the light-scattering
band of CsCl gradients was examined by electron microscopy, particles
resembling those of "empty" small RNA viruses were seen (Figure 2). However, a UV
wavelength scan (data not shown) indicated that little, if any, nucleic acid was
present and that the particles were mainly composed of protein. HPLC showed the
CsCl band to be composed almost solely of a single species of protein (Figure 3).
Since HPLC indicated a high degree of purity, the higher molecular weight
polypeptides are presumed to be multimers of pernin. It is likely that the minor,
lower molecular weight band is degraded pernin.

Chromatography, on a CPG 240-80 column, of semi-purified extracts, or of material
banded in CsCl, showed that the majority of pernin was eluted in the exclusion
volume using low molarity phosphate or Tris buffer as the eluent. In contrast, a
protein of similar size, bovine serum albumin (68 kDa), was included in the column
matrix. It appears, therefore, that pernin does aggregate into large, particle-like
structures under certain conditions as suspected from the particles seen in Figure
2. HPLC confirmed that pernin from P. canaliculus obtained by CPG chromatography
was highly purified. Aggregating protein species were also detected in extracts of
other shellfish: the blue mussel Mytilus edulis, the oyster Crassostrea gigas, and
New Zealand pipis Paphies australis but not in scallops Pecten novaezealandiae.
These polypeptides were lower in molecular weight than pernin (Figure 4a). The
pernin from P. canaliculus is N-glycosylated as shown by a reduction in molecular
weight when treated with endoglycosidase-F before PAGE (Figure 4b).



The yield of pernin from whole mussel extractions averaged about 200 µg/mussel.
Improved yields of pernin were obtained by extracting haemolymph directly from live
P. canaliculus. A small notch was made in the shell using a triangular file and a 30
gauge needle inserted into the posterior adductor muscle. From 1 to 5 ml of
haemolymph can be withdrawn easily. The haemolymph was spun at low speed
(approximately 1000 g) to remove haemocytes and the resulting supernatant processed by
ultracentrifugation, for example at 250,000 g for 40 minutes, followed by either CPG
chromatography eluting with 0.1 M sodium phosphate buffer, pH 7.2, or isopycnic
banding in CsCl in phosphate buffer. The pernin obtained in this way appeared no
different from that purified from whole mussels and had the advantage of a 30-fold
average increase in yield from each mussel. Haemolymph contained around 2
mg/ml (average ~5-6 mg/mussel) of pernin, which is by far the most predominant
polypeptide species (Figure 5a). The time to purify pernin was reduced from about 5
days to 1 day.
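The yield figures quoted above are consistent with one another, as the following worked arithmetic shows. This is a sketch only: the constants are the rounded values stated in the text, and the volumes in the loop are illustrative points within the stated 1 to 5 ml range rather than measured values.

```python
WHOLE_MUSSEL_YIELD_MG = 0.2       # about 200 micrograms of pernin per mussel
HAEMOLYMPH_CONC_MG_PER_ML = 2.0   # around 2 mg of pernin per ml of haemolymph


def haemolymph_yield_mg(volume_ml):
    """Pernin recovered (mg) from a given volume of haemolymph."""
    return HAEMOLYMPH_CONC_MG_PER_ML * volume_ml


def fold_increase(volume_ml):
    """Yield from haemolymph relative to whole-mussel extraction."""
    return haemolymph_yield_mg(volume_ml) / WHOLE_MUSSEL_YIELD_MG


if __name__ == "__main__":
    for volume_ml in (1.0, 3.0, 5.0):
        print(volume_ml, "ml:",
              haemolymph_yield_mg(volume_ml), "mg,",
              round(fold_increase(volume_ml)), "fold")
```

At about 3 ml of haemolymph per mussel the calculation gives roughly 6 mg, which corresponds to the 30-fold average increase and the 5-6 mg/mussel average reported.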



Microsequencing of the N-terminal region and internal fragments generated by
chemical and enzymatic cleavage from purified pernin was performed and generated
the following sequences of cleavage fragments:

(a) DGEQCNDGQN

(b) QGGHEVESERVACCVIGRA

(c) GQSHPEIVH

(d) YHGHDDA

(e) VVNEVHH.



The letters in these sequences code for amino acids as follows:

A    alanine
C    cysteine
D    aspartic acid
E    glutamic acid
F    phenylalanine
G    glycine
H    histidine
I    isoleucine
K    lysine
L    leucine
M    methionine
N    asparagine
P    proline
Q    glutamine
R    arginine
S    serine
T    threonine
V    valine
W    tryptophan
Y    tyrosine



The sequence data was then compared with amino acid sequences in searchable
computer databases. Some sequences were found to be of particular interest:

a) a 10 amino acid residue sequence from the N-terminus of pernin
(sequence (a) above) showed homology only with an 8 residue anti-thrombin protein
sequence from terrestrial leeches (data from US Patent 5,455,181 Oct 3, 1995:
sequence 10).

Perna canaliculus pernin    2 GEQCNDGQ 9
matching amino acids          G+ CNDGQ
leech anti-thrombin         5 GQSCNDGQ 12

identities: 6/8 (75%) positives: 7/8 (87%);
+ indicates an equivalent amino acid;
the numerals indicate amino acid position
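The identity and positive counts for this short ungapped alignment can be reproduced by hand or with a few lines of code. The sketch below is illustrative only: it scores "positives" using the equivalence groups (a) to (e) listed earlier in this specification, which is an assumption made for the example (BLAST itself scores positives from a substitution matrix), and the function names are hypothetical.

```python
# Equivalence groups (a) to (e) in one-letter code.
EQUIVALENT_GROUPS = [
    set("ASTPG"),  # (a) Ala, Ser, Thr, Pro, Gly
    set("NDEQ"),   # (b) Asn, Asp, Glu, Gln
    set("HRK"),    # (c) His, Arg, Lys
    set("MLIV"),   # (d) Met, Leu, Ile, Val
    set("FYW"),    # (e) Phe, Tyr, Trp
]


def equivalent(x, y):
    """True if two residues are identical or share an equivalence group."""
    return x == y or any(x in g and y in g for g in EQUIVALENT_GROUPS)


def score_ungapped(seq1, seq2):
    """Return (identities, positives) for two equal-length aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be the same length")
    identities = sum(a == b for a, b in zip(seq1, seq2))
    positives = sum(equivalent(a, b) for a, b in zip(seq1, seq2))
    return identities, positives


if __name__ == "__main__":
    pernin = "GEQCNDGQ"   # pernin residues 2 to 9
    leech = "GQSCNDGQ"    # leech anti-thrombin residues 5 to 12
    ident, pos = score_ungapped(pernin, leech)
    n = len(pernin)
    print(f"identities: {ident}/{n}  positives: {pos}/{n}")
```

The single "+" position is the Glu/Gln pair, both members of group (b); the Gln/Ser pair falls in different groups and counts as a mismatch.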
b) An internal cleavage product (sequence (b) above) was shown to have
homology to the Cu-Zn class of proteins known as "SODs" (superoxide dismutases).

Each of fragments (a) to (e) is part of the larger pernin amino acid sequence:

  1  DGEQCNDGQN KDDHHDDHHD DHHDDHDDDD ETMHYAQCEM EPNPHMASSL
 51  HHHVHGSIEL SQKGHGAVYL ELHLVGFNTS EDHDDHHHGL HLHMLGDMSA
101  GCDSIGELYN AHPEKHADPG DLGDLVDDDR GVVNEVHHYA WLDIDGTAPN
151  TEALIGHSMT ILQGSHTDAD TPASRIACCV IGHGKARPET AAALHHELEE
201  DKTEHYAHCD VRSNTHQPKA LHHHVHGTID FKQVGYGDLE VSYHLEGFNV
251  SDDHKDHLHD VQIYANGDLT SGCDNLGAKY DPHEDYHSEL GDLGDIHDDD
301  HGVVNESHRY SWINIFGDDS VLGRSIAIHQ RDHLHKSAKI ACCVIGRGQS
351  HPEIVHRAKC VVRPNTESTG LHHHVSGSIT FEQTPGGSTH MTADLKGFNV
401  SEDLSHHRHG VQLHEWGDMS HGCHSLGRMY HGHDDAHDPK RPGDLGDVID
451  DSHGIVHSTR TFDHLNVEDL NARSLVIMQG GHEVESERVA CCVIGRA

(Bold characters indicate directly sequenced fragments (a) to (e)). 
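Because the typographic emphasis marking the sequenced fragments does not survive in plain text, their positions can instead be located programmatically. The following is a minimal sketch under stated assumptions: the sequence is transcribed from the listing above, fragment (e) is taken as VVNEVHH (the form in which it occurs in the full sequence), the residue masses are standard average values, and the mass estimate is for the unmodified chain only (it ignores glycosylation and any bound metal). The function names are hypothetical.

```python
# Pernin amino acid sequence, transcribed in blocks of ten residues.
PERNIN = "".join("""
DGEQCNDGQN KDDHHDDHHD DHHDDHDDDD ETMHYAQCEM EPNPHMASSL
HHHVHGSIEL SQKGHGAVYL ELHLVGFNTS EDHDDHHHGL HLHMLGDMSA
GCDSIGELYN AHPEKHADPG DLGDLVDDDR GVVNEVHHYA WLDIDGTAPN
TEALIGHSMT ILQGSHTDAD TPASRIACCV IGHGKARPET AAALHHELEE
DKTEHYAHCD VRSNTHQPKA LHHHVHGTID FKQVGYGDLE VSYHLEGFNV
SDDHKDHLHD VQIYANGDLT SGCDNLGAKY DPHEDYHSEL GDLGDIHDDD
HGVVNESHRY SWINIFGDDS VLGRSIAIHQ RDHLHKSAKI ACCVIGRGQS
HPEIVHRAKC VVRPNTESTG LHHHVSGSIT FEQTPGGSTH MTADLKGFNV
SEDLSHHRHG VQLHEWGDMS HGCHSLGRMY HGHDDAHDPK RPGDLGDVID
DSHGIVHSTR TFDHLNVEDL NARSLVIMQG GHEVESERVA CCVIGRA
""".split())

FRAGMENTS = {
    "a": "DGEQCNDGQN",
    "b": "QGGHEVESERVACCVIGRA",
    "c": "GQSHPEIVH",
    "d": "YHGHDDA",
    "e": "VVNEVHH",
}

# Average residue masses in daltons (residue = amino acid minus water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER_MASS = 18.015


def locate(fragment, sequence=PERNIN):
    """Return 1-based (start, end) of the first occurrence, or None."""
    start = sequence.find(fragment)
    if start < 0:
        return None
    return start + 1, start + len(fragment)


def approximate_mass_da(sequence=PERNIN):
    """Approximate mass of the unmodified polypeptide chain in daltons."""
    return sum(AVERAGE_RESIDUE_MASS[r] for r in sequence) + WATER_MASS


if __name__ == "__main__":
    print("length:", len(PERNIN))
    for label, fragment in FRAGMENTS.items():
        print(f"({label}) {fragment}: residues {locate(fragment)}")
    print("approximate mass:", round(approximate_mass_da() / 1000, 1), "kDa")
```

The mass estimate can be compared with the value of about 55 kDa that the specification infers from the coding sequence, as distinct from the 75 kDa apparent weight of the glycosylated protein on PAGE.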
Section 2

Anti-thrombin Activity

The possibility that pernin could function as an anti-thrombin agent was examined
in a kinetic assay for thrombin inhibition.

Thrombin inhibition assay

Kinetic assays were done using an Accucolor™ Antithrombin III kit (catalogue no.
CRS105, Sigma Diagnostics, USA) with the reagents prepared according to the
supplier's directions. Standard plasma was supplied by Instrumentation
Laboratories (Italy) and used at the recommended dilution of 1/41. Samples of
purified mussel protein in water were diluted 9/10 by adding 10X Sigma sample
buffer. Heparin was purchased from Instrumentation Laboratories. Thrombin
activity was estimated colorimetrically at 405 nm using a chromogenic substrate
(H-D-HHT-L-Ala-L-Arg-pNa.2AcOH, catalogue no. A 8058, Sigma, USA) and a Multiskan
Biochromatic plate reader (Labsystems, Finland).

This verified that pernin had inhibitory activity. When a purified preparation of
pernin was centrifuged through a 30,000 MW exclusion filter (Figure 5a), all the
anti-thrombin activity was in the retentate and no detectable activity was present in
the filtrate (Figure 5b). The standard serum was diluted 1/41 as recommended for
this assay system; the pernin concentration was not determined directly but was in
the 1 mg/ml range. From this kinetic data pernin inhibition was estimated to be
about 50% of the level of human plasma (approximately 1 mg/ml pernin diluted
9/10 compared with the 1/41 plasma dilution in the standard ATIII assay system).
Heparin, a co-factor required for ATIII inhibition of thrombin, was not required for
inhibitory action by pernin.

Metal Binding Activity

HiTrap® Chelating affinity columns (Amersham Pharmacia Biotech, 1 ml size) were
prepared according to the manufacturer's instructions. The columns were then
charged with either 0.1M cupric chloride or zinc chloride before equilibrating in a
buffer (0.050M sodium phosphate and 0.5M sodium chloride containing 0.5mM
imidazole, pH 7.0). Protein samples purified using CsCl centrifugation were
suspended in this buffer and applied to the column using a chromatographic system
(Econo System, Bio-Rad Laboratories, USA). Following washing of the column for 5
mins with buffer during which no protein appeared in the eluate, a linear gradient
over 20 min at 1 ml/min was used to develop the column using buffer with the
imidazole concentration at 100mM from 0-100%. The protein eluted within the
gradient, being retained longer on the copper chelation column than on the zinc
column. The absorption of the eluate was monitored at 254 nm.
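The linear elution gradient described above can be sketched numerically. This is an illustration under stated assumptions: 0% is taken to be the equilibration buffer (0.5 mM imidazole) and 100% the 100 mM imidazole buffer, with mixing assumed ideal and column dead volume ignored. The function names are hypothetical.

```python
START_MM = 0.5           # imidazole in the equilibration buffer (assumed 0% point)
END_MM = 100.0           # imidazole in the gradient buffer (assumed 100% point)
GRADIENT_MIN = 20.0      # gradient duration in minutes
FLOW_ML_PER_MIN = 1.0    # flow rate


def imidazole_mm(minutes_into_gradient):
    """Nominal imidazole concentration (mM) entering the column."""
    fraction = min(max(minutes_into_gradient / GRADIENT_MIN, 0.0), 1.0)
    return START_MM + fraction * (END_MM - START_MM)


def volume_delivered_ml(minutes_into_gradient):
    """Gradient volume delivered so far (ml)."""
    return FLOW_ML_PER_MIN * minutes_into_gradient


if __name__ == "__main__":
    for t in (0, 5, 10, 15, 20):
        print(f"{t:>2} min  {volume_delivered_ml(t):>4.1f} ml  "
              f"{imidazole_mm(t):6.2f} mM imidazole")
```

On this model, a protein retained longer on the copper column than on the zinc column elutes at a later time point and therefore at a higher nominal imidazole concentration.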

Divalent metal ion content of the CsCl purified protein was determined by dissolving
the protein in water at 10 mg/ml and analysing metal content by both atomic
absorption and plasma emission spectrometry by comparison with a water blank.
There was no significant divalent cation content in the protein purified by this
method. However, the high content of histidine coupled with acidic amino acid
residues, and the likely origin of this protein from a SOD precursor, point to
pernin having endogenous metal ions as part of its native structure when it is
purified by other methods not employing chaotropic agents like CsCl.
Section 3

Gene Sequencing Method

A suite of non-specific primers called pUZ5 was synthesised by Gibco-BRL for the
initial sequencing based on the N-terminal sequence of pernin. The general formula
was:

GAY GGN GAR CAR TGY AAY GAY GGN CAR AA

where Y represents a pyrimidine base, R represents a purine base and N represents
any one of the four nucleotide bases. Sequencing was done, initially using pUZ5
and an oligo-dT based "bottom strand" primer from PCR amplified cDNA.
Sequencing was done by dye-termination cycle sequencing using "BigDye" prism
technology (Applied Biosystems Incorporated, USA) according to their instructions.
Products were resolved on an ABI 377 automated sequencer. Following the initial
sequencing of approximately 500 base pairs, pernin-specific primers were
constructed and used to complete the sequencing of the pernin gene.
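The pUZ5 formula describes a family of related oligonucleotides rather than a single primer. The sketch below, offered only as an illustration, counts how many distinct sequences the formula covers and checks that the start of the sequence reported in this section is compatible with it. The function names are hypothetical; the meanings of R, Y and N are those given in the text.

```python
from math import prod

# Bases permitted by each symbol (R = purine, Y = pyrimidine, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}

PUZ5 = "GAY GGN GAR CAR TGY AAY GAY GGN CAR AA"


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.replace(" ", ""))


def compatible(primer, target):
    """True if every base allowed at each target position is allowed by the primer."""
    primer = primer.replace(" ", "")
    target = target.replace(" ", "")
    if len(primer) != len(target):
        return False
    return all(set(IUPAC[t]) <= set(IUPAC[p]) for p, t in zip(primer, target))


if __name__ == "__main__":
    print("primer length:", len(PUZ5.replace(" ", "")))
    print("distinct sequences:", degeneracy(PUZ5))
    # First 29 bases of the sequence reported in this section.
    print(compatible(PUZ5, "GAYGGGGAGCAGTGTAACGATGGGCAGAA"))
```

The formula contains seven two-fold positions (R or Y) and two four-fold positions (N), so the count is 2^7 x 4^2.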

This provided the following:

GAYGGGGAGCAGTGTAACGATGGGCAGAACAAAGATGACCACCATGACGACCACCACGATGATCA 
CCATGACGACCATGATGATGATGATGAAACAATGCACTATGCCCAGTGTGAAATGGAACCAAACC
CTCATATGGCTAGCAGCCTTCACCACCATGTCCATGGCAGCATAGAGTTGTCACAGAAGGGTCAT 

GGAGCTGTTTATCTAGAACTTCATCTTGTCGGATTCAACACAAGTGAAGACCATGACGACCACCA
TCATGGACTTCATCTGCACATGCTTGGTGACATGTCAGCAGGTTGTGATTCTATTGGCGAACTGT 
ACAATGCTCACCCAGAAAAACATGCTGACCCTGGTGACCTCGGTGACCTGGTTGACGATGATAGG 
GGCGTGGTTAATGAAGTTCATCATTATGCTTGGTTGGACATTGATGGTACAGCACCAAACACCGA 
AGCTCTCATTGGACACTCAATGACTATTTTACAAGGGAGTCACACCGATGCTGATACCCCAGCCA 

GTAGAATCGCCTGTTGTGTTATTGGTCATGGAAAAGCTCGCCCAGAAACAGCAGCTGCTCTACAT
CACGAGCTAGAGGAAGATAAAACTGAGCATTATGCCCATTGTGACGTAAGATCTAATACACACCA
ACCAAAGGCTCTTCATCATCATGTCCACGGAACCATCGATTTCAAACAAGTTGGTTATGGTGACC 
TTGAAGTGTCCTACCATTTAGAGGGATTTAATGTAAGTGATGACCACAAAGATCATCTCCATGAC 
GTACAGATCTACGCCAACGGTGACCTGACCAGTGGATGTGATAACCTCGGTGCTAAATATGATCC 

TCATGAAGATTACCACAGTGAGTTGGGTGATCTAGGAGATATTCACGATGATGACCATGGCGTTG
TCAATGAAAGCCACAGATATTCCTGGATCAATATCTTCGGTGATGACAGTGTCCTGGGACGTTCT 
ATTGCCATTCACCAAAGAGACCATCTTCATAAAAGTGCCAAAATTGCCTGTTGTGTCATAGGACG 
TGGACAGAGCCATCCAGAAATTGTTCACAGAGCTAAATGTGTTGTCAGACCTAATACAGAATCTA 
CTGGTTTACATCACCATGTCTCTGGTTCTATAACATTCGAACAGACCCCTGGAGGATCAACACAT 
ATGACGGCTGATCTCAAAGGATTTAACGTTAGTGAGGACTTGTCACATCATCGTCATGGTGTGCA
GCTCCATGAATGGGGAGATATGTCCCATGGCTGTCACTCCTTAGGCAGAATGTACCATGGTCATG 
ATGATGCTCATGACCCCAAAAGACCTGGTGACCTTGGTGATGTTATAGATGATTCCCATGGCATC 
GTTCATTCAACTAGAACCTTTGATCATCTTAATGTTGAAGATCTTAACGCACGTTCCCTTGTGAT 
TATGCAGGGCGGACATGAGGTCGAGAGTGAGAGGGTTGCTTGCTGTGTTATAGGACGGGCATGAA
TAACCTCACTAGAGTGACTTTGTCTAACATGACAATTAACAATTGTATAACTTCGCTAAAAAATA 
AAACAATGACACAATGNAAAAAAAAAAAAAAAAAAAAAA . 

Discussion

The present invention is a novel protein obtainable from Perna canaliculus, the
New Zealand green-lipped (Greenshell™) mussel. The protein appears to be able to
self-aggregate in structures resembling small virus-like particles (VLPs)
approximately 25 nm in diameter but lacking any nucleic acid. The protein was
found in extracts of whole mussels and appears to be the predominant protein in
haemolymph. The molecular weight of the protein was estimated to be 75 kDa by
PAGE and inferred to be 55 kDa from its polynucleotide encoding sequence but,
because of its ability to aggregate, the protein can be sedimented by
ultracentrifugation in a short time (e.g. 40 minutes at 250,000 g) whereas the
monomeric protein would not. Each ml of haemolymph yields, on average, about
2 mg of pernin. Haemolymph is easily obtained by withdrawing fluid from the
posterior adductor muscle of the shellfish, which can yield up to 5 ml without
obvious harm; it is not necessary to kill the mussel. The haemolymph obtained not
only contains high levels of pernin but is quite free of contaminating materials,
particularly compared with whole mussel extracts, so purification of pernin is
simple. For highly pure preparations of pernin, ultracentrifugation is followed by
isopycnic banding in a suitable density gradient medium such as CsCl.

The sequence of the N-terminus of pernin suggested that the protein might have
anti-thrombin activity. This was demonstrated in kinetic assays on purified pernin.
Since thrombin is a serine protease, pernin also acts as a serine protease inhibitor.

Comparison of the sequences obtained from several cleavage fragments against
amino acid sequences in a computer database suggests that in addition to the
anti-thrombin activity of pernin, the protein also possesses other activities. One of these
is the ability to bind divalent cations such as Zn2+ and Cu2+.

INDUSTRIAL APPLICATION

The preferred protein of the invention, pernin, has a number of utilities.

Because of its anti-thrombin activity pernin is potentially useful as an
anticoagulant agent. Thrombin normally acts as a protease which converts fibrinogen in
the blood to fibrin. Blood coagulation is counteracted by inhibitors, normally
anti-thrombin III (ATIII); pernin has also been shown to inhibit thrombin activity in an
ATIII assay system. In contrast to ATIII, whose action is accelerated by the presence
of heparin (a sulphated mucopolysaccharide), pernin does not require heparin as a
co-factor.

The pernin protein from P. canaliculus thus has value as a pharmaceutical. Since it 
15 is active as an anticoagulant in its native state it may also be useful as a natural 
therapeutic agent or health supplement. It is readily obtained as a natural product 
in high concentrations from mussel haemolymph. To obtain a highly pure 
preparation it is necessary only to remove haemocytes by centrifugation (or any 
other suitable method) followed by either ultracentrifugation (since pernin forms 
20 aggregates which readily sediment) and resuspension, isopycnic banding in a 
suitable medium such as CsCl, exclusion filtration through a suitable membrane 
which retains pernin, or chromatography through a medium such as controlled pore 
glass of suitable porosity. The result is a highly pure preparation of pernin. 

25 The mussel P. canaliculus produces large amounts of the protein naturally, with 
little cost or effort involved in production, processing or purification. 

A further utility of the protein arises from the fact that pernin can be stripped of 
divalent cations (for example by CsCl isopycnic banding, or pH variation). This 
30 allows for the addition of divalent cations of choice (such as Mg ++ , Cd* + , Zn ++ or Ca ++ ) 
to the metal stripped pernin. Such a protein, with a modified and pre-selected 
divalent cation loading, has application in the food and nutraceutical industries. 

The ability to bind divalent metal cations also gives rise to applications of the 
protein in bioremediation and/or cation recovery processes. The divalent cations 
can be present as contaminants or pollutants in, for example, a solution, and the 
solution passed by a substrate to which the protein is bound so that the cations are 
extracted. 

Yet a further utility arises from the fact that the protein is "self-aggregating", and 
can form into structures resembling empty virus-like particles of approximately 
25 nm in diameter. These empty virus-like particles are able to sequester other 
molecules inside them, with the consequent ability to function as delivery vehicles 
for those other molecules. Examples of molecules able to be delivered in this 
manner include pharmaceutically active compounds. 

Those persons skilled in the art will understand that the above description is 
provided by way of illustration only and that the invention is limited only by the 
appended claims. 

CLAIMS: 



1. An isolated protein which has a molecular weight of about 55 kDa and an 
amino acid sequence which includes one or more of the following: 

(a) SEQ ID NO. 1 

(b) SEQ ID NO. 2 

(c) SEQ ID NO. 3 

(d) SEQ ID NO. 4 

(e) SEQ ID NO. 5 

or an active fragment thereof. 

2. An isolated protein which comprises the amino acid sequence of SEQ ID 
NO. 7, or an active fragment thereof. 

3. An isolated protein which is obtainable from the haemolymph of Perna 
canaliculus which has an apparent molecular weight of 75 kDa determined 
by PAGE, or an active fragment thereof. 

4. A protein or fragment as claimed in any one of claims 1 to 3 which has 
activity as: 

(i) a serine protease inhibitor; or 

(ii) a divalent cation binding agent. 

5. A protein or fragment as claimed in claim 4 which has activity as a serine 
protease inhibitor. 

6. A protein or fragment as claimed in claim 4 which has activity as a divalent 
cation binding agent. 

7. A protein which is a functionally equivalent variant of a protein or fragment 
as claimed in claim 5 or 6. 

8. A protein which is obtainable from a shellfish other than Perna canaliculus 
and which is a functionally equivalent homologue of a protein or fragment 
as claimed in claim 5 or 6. 

9. A polynucleotide encoding a protein or fragment as claimed in any one of 
claims 1 to 8. 

10. A polynucleotide as claimed in claim 9 which comprises the nucleotide 
sequence of SEQ ID NO. 6 or a variant thereof. 

11. A polynucleotide which has the nucleotide sequence of SEQ ID NO. 8. 

12. A vector which includes a polynucleotide as claimed in any one of claims 9 
to 11. 

13. A host cell which expresses a polynucleotide as claimed in any one of claims 
9 to 11. 

14. A composition which comprises a protein or fragment as claimed in any one 
of claims 1 to 8. 

15. A composition as claimed in claim 14 which is a medicament. 

16. A composition as claimed in claim 14 which is a food. 

17. A composition as claimed in claim 14 which is a dietary supplement. 

18. A dietary supplement as claimed in claim 17 in which said protein or 
fragment is associated with or bound to at least one divalent cation of 
dietary significance. 

19. A dietary supplement as claimed in claim 18 wherein said divalent metal 
cation is calcium, magnesium or zinc. 

20. A composition as claimed in claim 14 which is a bioremediation agent. 

21. A process for obtaining a protein as claimed in claim 3 which comprises the 
step of centrifuging material containing Perna canaliculus haemolymph or 
an extract thereof and recovering the sedimented protein. 

22. A process as claimed in claim 21 wherein said centrifuging step is 
ultracentrifugation. 

23. A process as claimed in claim 22 wherein said ultracentrifugation is 
performed for about 40 minutes at about 250,000g. 

24. A process as claimed in any one of claims 21 to 23 which includes the 
preliminary step of extracting haemolymph from Perna canaliculus. 

SEQUENCE LISTING 

<110> The Horticulture and Food Research Institute of Ne 

<120> Serine Protease Inhibitor 

<130> 25409 MRB 

<140> 
<141> 

<150> NZ 336906 
<151> 1999-07-23 

<160> 8 

<170> PatentIn Ver. 2.1 

<210> 1 
<211> 10 
<212> PRT 

<213> Perna canaliculus 
<400> 1 

Asp Gly Glu Gln Cys Asn Asp Gly Gln Asn 
1 5 10 



<210> 2 
<211> 19 
<212> PRT 

<213> Perna canaliculus 
<400> 2 

Gln Gly Gly His Glu Val Glu Ser Glu Arg Val Ala Cys Cys Val Ile 
1 5 10 15 

Gly Arg Ala 



<210> 3 
<211> 9 
<212> PRT 

<213> Perna canaliculus 
<400> 3 

Gly Gln Ser His Pro Glu Ile Val His 
1 5 



<210> 4 
<211> 7 
<212> PRT 

<213> Perna canaliculus 
<400> 4 

Tyr His Gly His Asp Asp Ala 
1 5 



<210> 5 
<211> 7 
<212> PRT 

<213> Perna canaliculus 
<400> 5 

Val Val Asn Glu Val His His 
1 5 



<210> 6 
<211> 1491 
<212> DNA 

<213> Perna canaliculus 

<220> 

<221> CDS 

<222> (1)..(1491) 

<400> 6 

gay ggg gag cag tgt aac gat ggg cag aac aaa gat gac cac cat gac 48 
Asp Gly Glu Gln Cys Asn Asp Gly Gln Asn Lys Asp Asp His His Asp 
1 5 10 15 

gac cac cac gat gat cac cat gac gac cat gat gat gat gat gaa aca 96 
Asp His His Asp Asp His His Asp Asp His Asp Asp Asp Asp Glu Thr 
20 25 30 

atg cac tat gcc cag tgt gaa atg gaa cca aac cct cat atg gct agc 144 
Met His Tyr Ala Gln Cys Glu Met Glu Pro Asn Pro His Met Ala Ser 
35 40 45 

agc ctt cac cac cat gtc cat ggc agc ata gag ttg tca cag aag ggt 192 
Ser Leu His His His Val His Gly Ser Ile Glu Leu Ser Gln Lys Gly 
50 55 60 

cat gga gct gtt tat cta gaa ctt cat ctt gtc gga ttc aac aca agt 240 
His Gly Ala Val Tyr Leu Glu Leu His Leu Val Gly Phe Asn Thr Ser 
65 70 75 80 

gaa gac cat gac gac cac cat cat gga ctt cat ctg cac atg ctt ggt 288 
Glu Asp His Asp Asp His His His Gly Leu His Leu His Met Leu Gly 
85 90 95 

gac atg tca gca ggt tgt gat tct att ggc gaa ctg tac aat gct cac 336 
Asp Met Ser Ala Gly Cys Asp Ser Ile Gly Glu Leu Tyr Asn Ala His 
100 105 110 

cca gaa aaa cat gct gac cct ggt gac ctc ggt gac ctg gtt gac gat 384 
Pro Glu Lys His Ala Asp Pro Gly Asp Leu Gly Asp Leu Val Asp Asp 
115 120 125 

gat agg ggc gtg gtt aat gaa gtt cat cat tat gct tgg ttg gac att 432 
Asp Arg Gly Val Val Asn Glu Val His His Tyr Ala Trp Leu Asp Ile 
130 135 140 

gat ggt aca gca cca aac acc gaa gct ctc att gga cac tca atg act 480 
Asp Gly Thr Ala Pro Asn Thr Glu Ala Leu Ile Gly His Ser Met Thr 
145 150 155 160 

att tta caa ggg agt cac acc gat gct gat acc cca gcc agt aga atc 528 
Ile Leu Gln Gly Ser His Thr Asp Ala Asp Thr Pro Ala Ser Arg Ile 
165 170 175 

gcc tgt tgt gtt att ggt cat gga aaa gct cgc cca gaa aca gca gct 576 
Ala Cys Cys Val Ile Gly His Gly Lys Ala Arg Pro Glu Thr Ala Ala 
180 185 190 

gct cta cat cac gag cta gag gaa gat aaa act gag cat tat gcc cat 624 
Ala Leu His His Glu Leu Glu Glu Asp Lys Thr Glu His Tyr Ala His 
195 200 205 

tgt gac gta aga tct aat aca cac caa cca aag gct ctt cat cat cat 672 
Cys Asp Val Arg Ser Asn Thr His Gln Pro Lys Ala Leu His His His 
210 215 220 

gtc cac gga acc atc gat ttc aaa caa gtt ggt tat ggt gac ctt gaa 720 
Val His Gly Thr Ile Asp Phe Lys Gln Val Gly Tyr Gly Asp Leu Glu 
225 230 235 240 

gtg tcc tac cat tta gag gga ttt aat gta agt gat gac cac aaa gat 768 
Val Ser Tyr His Leu Glu Gly Phe Asn Val Ser Asp Asp His Lys Asp 
245 250 255 

cat ctc cat gac gta cag atc tac gcc aac ggt gac ctg acc agt gga 816 
His Leu His Asp Val Gln Ile Tyr Ala Asn Gly Asp Leu Thr Ser Gly 
260 265 270 

tgt gat aac ctc ggt gct aaa tat gat cct cat gaa gat tac cac agt 864 
Cys Asp Asn Leu Gly Ala Lys Tyr Asp Pro His Glu Asp Tyr His Ser 
275 280 285 

gag ttg ggt gat cta gga gat att cac gat gat gac cat ggc gtt gtc 912 
Glu Leu Gly Asp Leu Gly Asp Ile His Asp Asp Asp His Gly Val Val 
290 295 300 

aat gaa agc cac aga tat tcc tgg atc aat atc ttc ggt gat gac agt 960 
Asn Glu Ser His Arg Tyr Ser Trp Ile Asn Ile Phe Gly Asp Asp Ser 
305 310 315 320 

gtc ctg gga cgt tct att gcc att cac caa aga gac cat ctt cat aaa 1008 
Val Leu Gly Arg Ser Ile Ala Ile His Gln Arg Asp His Leu His Lys 
325 330 335 

agt gcc aaa att gcc tgt tgt gtc ata gga cgt gga cag agc cat cca 1056 
Ser Ala Lys Ile Ala Cys Cys Val Ile Gly Arg Gly Gln Ser His Pro 
340 345 350 

gaa att gtt cac aga gct aaa tgt gtt gtc aga cct aat aca gaa tct 1104 
Glu Ile Val His Arg Ala Lys Cys Val Val Arg Pro Asn Thr Glu Ser 
355 360 365 

act ggt tta cat cac cat gtc tct ggt tct ata aca ttc gaa cag acc 1152 
Thr Gly Leu His His His Val Ser Gly Ser Ile Thr Phe Glu Gln Thr 
370 375 380 

cct gga gga tca aca cat atg acg gct gat ctc aaa gga ttt aac gtt 1200 
Pro Gly Gly Ser Thr His Met Thr Ala Asp Leu Lys Gly Phe Asn Val 
385 390 395 400 

agt gag gac ttg tca cat cat cgt cat ggt gtg cag ctc cat gaa tgg 1248 
Ser Glu Asp Leu Ser His His Arg His Gly Val Gln Leu His Glu Trp 
405 410 415 

gga gat atg tcc cat ggc tgt cac tcc tta ggc aga atg tac cat ggt 1296 
Gly Asp Met Ser His Gly Cys His Ser Leu Gly Arg Met Tyr His Gly 
420 425 430 

cat gat gat gct cat gac ccc aaa aga cct ggt gac ctt ggt gat gtt 1344 
His Asp Asp Ala His Asp Pro Lys Arg Pro Gly Asp Leu Gly Asp Val 
435 440 445 

ata gat gat tcc cat ggc atc gtt cat tca act aga acc ttt gat cat 1392 
Ile Asp Asp Ser His Gly Ile Val His Ser Thr Arg Thr Phe Asp His 
450 455 460 

ctt aat gtt gaa gat ctt aac gca cgt tcc ctt gtg att atg cag ggc 1440 
Leu Asn Val Glu Asp Leu Asn Ala Arg Ser Leu Val Ile Met Gln Gly 
465 470 475 480 

gga cat gag gtc gag agt gag agg gtt gct tgc tgt gtt ata gga cgg 1488 
Gly His Glu Val Glu Ser Glu Arg Val Ala Cys Cys Val Ile Gly Arg 
485 490 495 

gca 1491 
Ala 




<210> 7 
<211> 497 
<212> PRT 

<213> Perna canaliculus 
<400> 7 

Asp Gly Glu Gln Cys Asn Asp Gly Gln Asn Lys Asp Asp His His Asp 
1 5 10 15 

Asp His His Asp Asp His His Asp Asp His Asp Asp Asp Asp Glu Thr 
20 25 30 

Met His Tyr Ala Gln Cys Glu Met Glu Pro Asn Pro His Met Ala Ser 
35 40 45 

Ser Leu His His His Val His Gly Ser Ile Glu Leu Ser Gln Lys Gly 
50 55 60 

His Gly Ala Val Tyr Leu Glu Leu His Leu Val Gly Phe Asn Thr Ser 
65 70 75 80 

Glu Asp His Asp Asp His His His Gly Leu His Leu His Met Leu Gly 
85 90 95 

Asp Met Ser Ala Gly Cys Asp Ser Ile Gly Glu Leu Tyr Asn Ala His 
100 105 110 

Pro Glu Lys His Ala Asp Pro Gly Asp Leu Gly Asp Leu Val Asp Asp 
115 120 125 

Asp Arg Gly Val Val Asn Glu Val His His Tyr Ala Trp Leu Asp Ile 
130 135 140 

Asp Gly Thr Ala Pro Asn Thr Glu Ala Leu Ile Gly His Ser Met Thr 
145 150 155 160 

Ile Leu Gln Gly Ser His Thr Asp Ala Asp Thr Pro Ala Ser Arg Ile 
165 170 175 

Ala Cys Cys Val Ile Gly His Gly Lys Ala Arg Pro Glu Thr Ala Ala 
180 185 190 

Ala Leu His His Glu Leu Glu Glu Asp Lys Thr Glu His Tyr Ala His 
195 200 205 

Cys Asp Val Arg Ser Asn Thr His Gln Pro Lys Ala Leu His His His 
210 215 220 

Val His Gly Thr Ile Asp Phe Lys Gln Val Gly Tyr Gly Asp Leu Glu 
225 230 235 240 

Val Ser Tyr His Leu Glu Gly Phe Asn Val Ser Asp Asp His Lys Asp 
245 250 255 

His Leu His Asp Val Gln Ile Tyr Ala Asn Gly Asp Leu Thr Ser Gly 
260 265 270 

Cys Asp Asn Leu Gly Ala Lys Tyr Asp Pro His Glu Asp Tyr His Ser 
275 280 285 

Glu Leu Gly Asp Leu Gly Asp Ile His Asp Asp Asp His Gly Val Val 
290 295 300 

Asn Glu Ser His Arg Tyr Ser Trp Ile Asn Ile Phe Gly Asp Asp Ser 
305 310 315 320 

Val Leu Gly Arg Ser Ile Ala Ile His Gln Arg Asp His Leu His Lys 
325 330 335 

Ser Ala Lys Ile Ala Cys Cys Val Ile Gly Arg Gly Gln Ser His Pro 
340 345 350 

Glu Ile Val His Arg Ala Lys Cys Val Val Arg Pro Asn Thr Glu Ser 
355 360 365 

Thr Gly Leu His His His Val Ser Gly Ser Ile Thr Phe Glu Gln Thr 
370 375 380 

Pro Gly Gly Ser Thr His Met Thr Ala Asp Leu Lys Gly Phe Asn Val 
385 390 395 400 

Ser Glu Asp Leu Ser His His Arg His Gly Val Gln Leu His Glu Trp 
405 410 415 

Gly Asp Met Ser His Gly Cys His Ser Leu Gly Arg Met Tyr His Gly 
420 425 430 

His Asp Asp Ala His Asp Pro Lys Arg Pro Gly Asp Leu Gly Asp Val 
435 440 445 

Ile Asp Asp Ser His Gly Ile Val His Ser Thr Arg Thr Phe Asp His 
450 455 460 

Leu Asn Val Glu Asp Leu Asn Ala Arg Ser Leu Val Ile Met Gln Gly 
465 470 475 480 

Gly His Glu Val Glu Ser Glu Arg Val Ala Cys Cys Val Ile Gly Arg 
485 490 495 

Ala 



<210> 8 
<211> 1611 
<212> DNA 

<213> Perna canaliculus 
<220> 

<221> polyA_signal 
<222> (1557)..(1563) 

<220> 

<221> misc_feature 
<222> (1492)..(1494) 
<223> Opal stop codon 

<400> 8 

gayggggagc agtgtaacga tgggcagaac aaagatgacc accatgacga ccaccacgat 60 
gatcaccatg acgaccatga tgatgatgat gaaacaatgc actatgccca gtgtgaaatg 120 
gaaccaaacc ctcatatggc tagcagcctt caccaccatg tccatggcag catagagttg 180 
tcacagaagg gtcatggagc tgtttatcta gaacttcatc ttgtcggatt caacacaagt 240 
gaagaccatg acgaccacca tcatggactt catctgcaca tgcttggtga catgtcagca 300 
ggttgtgatt ctattggcga actgtacaat gctcacccag aaaaacatgc tgaccctggt 360 
gacctcggtg acctggttga cgatgatagg ggcgtggtta atgaagttca tcattatgct 420 
tggttggaca ttgatggtac agcaccaaac accgaagctc tcattggaca ctcaatgact 480 
attttacaag ggagtcacac cgatgctgat accccagcca gtagaatcgc ctgttgtgtt 540 
attggtcatg gaaaagctcg cccagaaaca gcagctgctc tacatcacga gctagaggaa 600 
gataaaactg agcattatgc ccattgtgac gtaagatcta atacacacca accaaaggct 660 
cttcatcatc atgtccacgg aaccatcgat ttcaaacaag ttggttatgg tgaccttgaa 720 
gtgtcctacc atttagaggg atttaatgta agtgatgacc acaaagatca tctccatgac 780 
gtacagatct acgccaacgg tgacccgacc agtggatgtg ataacctcgg tgctaaatat 840 
gatcctcatg aagattacca cagtgagttg ggtgatctag gagatattca cgatgatgac 900 
catggcgttg tcaatgaaag ccacagatat tcctggatca atatcttcgg tgatgacagt 960 
gtcctgggac gttctattgc cattcaccaa agagaccatc ttcataaaag tgccaaaatt 1020 
gcctgttgtg tcataggacg tggacagagc catccagaaa ttgttcacag agctaaatgt 1080 
gttgtcagac ctaatacaga atctactggt ttacatcacc atgtctctgg ttctataaca 1140 
ttcgaacaga cccctggagg atcaacacat atgacggctg atctcaaagg atttaacgtt 1200 
agtgaggact tgtcacatca tcgtcatggt gtgcagctcc atgaatgggg agatatgtcc 1260 
catggctgtc actccttagg cagaatgtac catggtcatg atgatgctca tgaccccaaa 1320 
agacctggtg accttggtga tgttatagat gattcccatg gcatcgttca ttcaactaga 1380 
acctttgatc atcttaatgt tgaagatctt aacgcacgtt cccttgtgat tatgcagggc 1440 
ggacatgagg tcgagagtga gagggttgct tgctgtgtta taggacgggc atgaataacc 1500 
tcactagagt gactttgtct aacatgacaa ttaacaattg tataacttcg ctaaaaaata 1560 
aaacaatgac acaatgnaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa a 1611 
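
The feature annotations in this listing can be cross-checked computationally. The following is a minimal sketch in Python which translates two excerpts of SEQ ID NO. 8 with the standard genetic code: nucleotides 1 to 30, and nucleotides 1441 to 1500. The excerpts are copied from the listing; the function names and the treatment of the IUPAC ambiguity code "y" are assumptions made for illustration and form no part of the original listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Subset of IUPAC nucleotide codes; 'y' (c or t) and 'n' (any) occur in SEQ ID NO. 8.
IUPAC = {"a": "a", "c": "c", "g": "g", "t": "t", "y": "ct", "r": "ag", "n": "acgt"}


def translate_codon(codon: str) -> str:
    """Return the one-letter residue, or 'X' if an ambiguity changes the residue."""
    residues = {
        CODON_TABLE["".join(bases)]
        for bases in product(*(IUPAC[base] for base in codon))
    }
    return residues.pop() if len(residues) == 1 else "X"


def translate(dna: str) -> str:
    """Translate in frame 1, ignoring spaces and any trailing partial codon."""
    dna = dna.replace(" ", "").lower()
    return "".join(translate_codon(dna[i:i + 3]) for i in range(0, len(dna) - 2, 3))


# Nucleotides 1-30 and 1441-1500 of SEQ ID NO. 8, copied from the listing.
HEAD = "gayggggagc agtgtaacga tgggcagaac"
TAIL = "ggacatgagg tcgagagtga gagggttgct tgctgtgtta taggacgggc atgaataacc"

if __name__ == "__main__":
    print(translate(HEAD))
    print(translate(TAIL))
```

Under these assumptions the first excerpt translates to the ten residues of SEQ ID NO. 1 (DGEQCNDGQN in one-letter code). The second gives the C-terminal residues of SEQ ID NO. 7, ending in Ala, followed by a stop at the tga codon annotated at positions 1492 to 1494.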